nature 


THE INTERNATIONAL WEEKLY JOURNAL OF SCIENCE 


The epigenetic trait that can ruin an oil palm crop
PAGES 466 & 533


Weed out Bad Karma 




OLD TAX FILES WANTED Making administrative data accessible and anonymous PAGE 447
DUAL-USE FUEL CELLS A smart way to boost electrical power grids
USE YOUR IMAGINATION Sci-fact as an entrée into sci-fi


EDITORIALS
STEM CELLS Curtain comes down on scientific controversy p.426
WORLD VIEW End the unfair racket of academic jobs for the boys p.427
DECEPTION Orchid shape and smell fools amorous wasps p.429


In the name of beauty 


The ugly truth is that the plastic microbeads found in many skin scrubs and other personal-care 
products are a serious pollutant of the marine environment. They should be phased out rapidly. 


A beautiful woman comes into focus. What makes her skin glow so? Why, she says,
she uses Aveeno’s Positively Radiant skin-brightening daily scrub for
“naturally beautiful results”.

What is not clear from this advertisement is that the “gentle
exfoliators” in the product promoted by Jennifer Aniston are minuscule
beads of plastic. When Aniston, or those she inspires to follow her,
rinse the scrub down the drain, many of the beads end up in the sea,
where they will persist indefinitely. This is unnecessary, damaging and
must stop.

Others agree, and the face scrub, along with hundreds of other
products, including toothpastes, may not be long for this world. On
10 September, the California Legislature sent a bill (AB 888) to the
state’s governor, Jerry Brown, that would ban the inclusion of spheres of
polyethylene, polypropylene and other plastics less than 5 millimetres
across in personal-care products after 2020.

If signed into law, the bill will prevent trillions of plastic beads from
being rinsed down the drain. Not all of these make it to the sea (wastewater
treatment plants can sift out 90% of them), but the problems
caused by the remaining millions are considerable. (Meanwhile, beads
trapped in ‘sludge’ at the plants do not disappear. Plenty are sprayed on
crops, from where they escape to rivers and lakes.)

In a paper published on 3 September, aquatic-health researcher
Chelsea Rochman at the University of California, Davis, and her colleagues
estimate that 8 trillion microbeads per day are emitted into
aquatic habitats in the United States alone (C. M. Rochman et al.
Environ. Sci. Technol. http://doi.org/7sw; 2015).

The beads are more pernicious than mere litter. Roughly the size of
many plankton species, they are eaten by marine creatures. One study
in 2014 saw them consumed by several taxa of zooplankton, including
mysid shrimps, copepods, rotifers and ciliates (O. Setälä et al. Environ.
Pollut. 185, 77–83; 2014). Some of these are then eaten by larger creatures,
and toxic chemicals in the plastics, as well as other toxic chemicals
that adhere to plastic particles, accumulate in fish, which might
end up on our dinner tables.

California would not be the first place to pass a microbead ban, but
as the world’s seventh- or eighth-largest economy, its move would carry
weight. Just as in automotive-fuel efficiency standards or flammability
requirements on furniture, where California goes, other places in the
United States and elsewhere follow. The California bill is also stronger
than many before it. It does not include a common loophole allowing
for the use of ‘biodegradable’ beads, which are unlikely to truly
degrade anywhere except in an industrial composter.

California legislators have made the right call, but the phase-out
period is too long. No luminous complexion is worth the wholesale
pollution of Earth’s oceans. Consumer-goods giant Unilever says that
it has already removed microbeads from all of its scrubs and washes.
And there are plenty of well-tested alternative exfoliants, including nut
shells, sand and sugar. So why wait five years to stop polluting?


While bans and phase-outs slowly take effect, the Beat the Microbead 
campaign, funded by Dutch non-governmental organizations the Plas- 
tic Soup Foundation and the North Sea Foundation, has created an app 
for consumers who want to avoid contributing to the problem. A few 
clicks can confirm whether the tempting scrub in the pharmacy aisle 
contains the beads. This is helpful in the short term, but ultimately the 
onus of responsibility should not be on the consumer. 

Microbeads are not the only source of microplastic in the oceans. Tiny plastic
pellets used in making plastic items spill into the sea; plastic bags and
bottles break down over time. On almost any beach on Earth, the sand
carries tiny, bright grains of plastic.

And macroplastics remain a serious problem. A study published last month estimated
that around 90% of seabirds have plastic in their bellies (C. Wilcox et al.
Proc. Natl Acad. Sci. USA http://doi.org/7dv; 2015). Some birds mistake
shopping bags for jellyfish; others confuse cigarette lighters and pen
caps with prey and fly home to feed them to their chicks.

The consequences of this ubiquitous plastic for marine species,
marine ecosystems and human health remain areas of active research.
But the public and policymakers need not wait for detailed results
before taking action. Banning microbeads will not solve the plastic-pollution
problem, but it is an easy start. Jennifer Aniston and the
millions of other people who wash their faces with plastic can still look
radiant without feeding their skincare regime to copepods. The alternative
is to forever blush with shame.


Power play 


The replacement of mitochondria does not 
signal ethical problems. 


Recent advances in genetic and stem-cell technologies raise
the inviting prospect that some devastating diseases could be
treated. Conditions caused by natural mutations might be
avoided by judicious genome tinkering to set things right for the next
generations. But ‘inviting’ does not always mean ‘advisable’.

The United Kingdom last week released new draft guidelines for
one such treatment, mitochondrial replacement (see go.nature.com/thcouy).
The guidelines are scheduled to come into force next
month, when clinics in Britain will be allowed to offer the treatment.
Not everybody agrees that this inviting idea is advisable. As such, it is
timely to consider the ethical and technical matters at stake.

At issue is not the nuclear genome, which contains the blueprint of
an entire organism, but the genomes in our mitochondria — the small, 
energy-generating organelles in most of our cells. The often-overlooked 
mitochondrial genome contains only a few dozen genes, but it deserves 
as much respect as its much larger room-mate, which contains some 
20,000. The impacts of an unfortunate mitochondrial mutation range 
from an inability to exercise hard to very serious, albeit rare, diseases. 

Mitochondrial replacement involves replacing diseased mitochon- 
dria with fresh, healthy ones. This requires involving a third person 
beyond the parents — a woman to donate an egg to the process that 
contains only healthy mitochondria (hence ‘three-person embryo’).

The procedure does not alter the mitochondrial genome. But on 
the basis of animal experiments, some biologists claim that foreign 
mitochondrial genes might interfere with the expression of the nuclear 
genome in unpredictable, and perhaps dangerous, ways (see page 444). 

These concerns were brought up during the consultation process 
with scientists and the public carried out by the UK Human Fertilisa- 
tion and Embryology Authority (HFEA) before the UK Parliament 
voted in February in favour of allowing the procedure. Far from being 
rushed, as some claim, the consultation was done over many years 
and was judged as a fair public-engagement exercise by independent 
experts who monitored the process. 

The HFEA believes that the problems seen in organisms such as flies 
and mice would not be repeated in humans — in the main because they 
have not shown up in children of mixed-race couples in which the mito- 
chondrial DNA of the mother and the nuclear DNA of the father are 
likely to be the most distant. This point helps to address ethicists’ worries 
that unanticipated problems in children born following mitochondrial
replacement could be passed on through the generations.

Other ethical concerns about the UK move can be summarized as 
anxiety over a possible slippery slope to full-scale germline manipula- 
tion to address a broader range of conditions. These concerns are height- 
ened by advances in gene-editing techniques such as CRISPR/Cas9. 

Last week’s release of the HFEA regulations should dispel fears of
a slippery slope. Applications are narrow and oversight is strict. The
agency decided to allow mitochondrial replacement only to avoid serious diseases,
and not for the attempted treatment of infertility. (Some clinics in Canada
have offered the procedure in the belief that a shot of fresh,
young mitochondria may somehow invigorate eggs from older women, but there is little
scientific evidence for this.)

The regulations explicitly exclude the editing of the nuclear or
mitochondrial genome. Licences will be given only to centres whose
competence has been approved, and even then, these centres will have
to seek separate approval for each patient. Licensed centres will be
obliged to put a process in place to monitor the clinical follow-up
of children born following mitochondrial donation, providing that
parents agree.

Scientists estimate that the number of women likely to be eligible for
the procedure will be around 150 per year in Britain and about 800 in
the United States, where the Institute of Medicine is carrying out a
similar consultation for the US Food and Drug Administration, which will be
responsible for licensing it. The United Kingdom has made an advisable
step forward that serves as a useful invitation for all to follow.


STAP revisited 


Reanalysis of the controversy provides a strong 
example of the self-correcting nature of science. 


This week, Nature revisits one of the most controversial scientific
episodes in recent years: the now-retracted discovery of a
claimed new way to reprogram cells, stimulus-triggered acquisition
of pluripotency (STAP). On our website we publish two Brief
Communications Arising (BCAs) that relate to the retraction. And on
page 469 we publish a related Review on pluripotency.

One BCA details the efforts made by many laboratories to reproduce
the STAP phenomenon without success (A. De Los Angeles et al. Nature
http://dx.doi.org/10.1038/nature15513; 2015). The other presents the
results of a genomic analysis of the claimed STAP cells, performed as
part of a 2014 investigation by Japan’s RIKEN institute but not previously
published (D. Konno et al. Nature http://dx.doi.org/10.1038/nature15366;
2015). Using sequencing-based approaches, this analysis
shows that all of the claimed STAP cell lines were contaminated with
embryonic stem cells, and that this contamination affected the results.
De Los Angeles and colleagues’ BCA also includes an analysis of
sequencing results published in the original papers, and reaches similar
conclusions regarding contamination.

The Review, written by a collaboration of leading scientists who
work with pluripotent stem cells, offers a state-of-the-art summary of
the field, and provides a checklist that researchers can use to determine
whether a cell has pluripotent capacity.

Why is Nature publishing these pieces? The main reason is to update
the scientific record. The wording of the STAP retraction notices
left open the possibility that the phenomenon was genuine. It said:
“Multiple errors impair the credibility of the study as a whole and we
are unable to say without doubt whether the STAP-SC phenomenon
is real.” The two BCAs clearly establish that it is not.

It is also important to recognize and highlight the community-driven
effort to reproduce the findings. The negative results of some
of these efforts were made public informally during the controversy, 
but for some lengthy experiments this was not possible. Science-in- 
the-making can be made public immediately. But, ultimately, repro- 
ducibility efforts should be peer reviewed. 

Another reason why Nature has chosen to publish this trio of pieces 
is to address some of the indirect questions posed by the high-profile 
controversy, which provoked discussions in both the stem-cell field 
and the broader research community. The Review, in particular, is 
intended to offer guidance from the community to help researchers, 
editors and reviewers to decide how best to evaluate future claims as 
well as how to view those already published in the scientific literature. 
Comparing the genotypes of reprogrammed pluripotent stem cells 
with those of parental cells, it points out, can check their provenance. 

The stem-cell field holds enormous promise for therapy. As a result,
all claims of considerable importance should be verified with utmost 
care before being made public. The Review suggests that such claims 
in the field of reprogramming and pluripotency should be demon- 
strated in more than one experimental model, and encourages their 
independent replication. 

Nature will endeavour to help the field to achieve its promise, 
and is looking at ways to support and encourage this reproduc- 
ibility enterprise. For example, we ask authors to include more 
details about the methods developed in their studies. We strongly 
encourage our authors to deposit step-by-step protocols on freely 
accessible platforms, such as Protocol Exchange (www.nature.com/ 
protocolexchange) — this may be requested for extraordinary 
claims, at the editor’s discretion. We encourage our authors to verify 
the origin of the cell lines they use, as we do for cancer cell lines (see 
Nature 520, 264; 2015). 

The Review concludes: “Science is ultimately a 
self-correcting process where the scientific com- 
munity plays a crucial and collective role.” In this 
case, the stem-cell community has excelled in 
that role and should be congratulated.


© 2015 Macmillan Publishers Limited. All rights reserved 


WORLD VIEW


Make academic job
advertisements fair to all


Too many university posts are given to men without proper competition,
says Mathias Wullum Nielsen.


It is well known that women are under-represented in senior
science and research positions. This is true even in Denmark,
which has long been considered one of the most advanced
societies when it comes to gender equality. Although stories of sexism
in science often focus on explicit bias, more-subtle factors are
widely influential too.

Universities like to think of themselves as meritocracies. Indeed, one
of the arguments used against programmes that aim to proactively promote
the careers of women scientists is that scientists must be recruited
on talent alone. When criticized over the appointment of (another) male
scientist to a senior role, universities often respond by pointing to rules
and policies about how vacancies invite all to apply.

I carried out an analysis that raises some troubling questions about
how closely universities follow these principles
(M. W. Nielsen Sci. Public Policy http://doi.org/7q6; 2015).
In the decade to 2013, about one-fifth of associate- and full-professor
positions at Aarhus University, one of the largest
in Denmark, were filled through a ‘closed’
recruitment procedure: no advertisement
and usually just a single applicant. The share
of female candidates for such positions is particularly
low: just 12% of applicants for full
professorships were women.

With ‘open’ recruitment, the proportion
of female applicants for full-professor roles
rises to 23%. But a significant proportion still
attracted only a single applicant, suggesting
that the adverts were being written to target
a specific candidate, or timed to fit their
career progression. Evidence suggests that similar practices, to various
extents, are common at other Danish research institutions and abroad.

Despite institutional efforts to make recruitment more robust and
transparent, and a 2008 Danish ministerial decree that “professorships
must be advertised internationally, except under special circumstances”,
my analysis shows that the use of closed recruitment
at Aarhus University increased from 8% of tenured appointments in
2004–08 to 30% in 2009–13.

Such appointments do not usually break any rules; loopholes can be
exploited in most cases. But the numbers suggest a lack of institutional
commitment to overarching organizational and legal stipulations. And
they confirm what most academics may already suspect: rising in the
ranks is not a question merely of what you know, but of who you know.

This puts women at a particular disadvantage. Academic advancement
through back-door hiring largely depends
on reputation and visibility to the local gatekeepers,
and women lose out under such procedures
for two reasons.

First, women have been shown to have
weaker personal ties to the core of the concentric circles of academic
networks, making them less visible to decision-makers. Second,
scholars have argued that male decision-makers’ desire for organizational
certainty and their attraction to candidates with whom they
share values and behaviour create subtle and often unconscious
practices of ‘male closeness’ and ‘gender homophily’ (preference
for someone similar to oneself).

Gender scholars have previously shown that discrimination may be 
particularly prevalent in organizations that pride themselves on being 
meritocratic. Strong institutionalized beliefs in meritocracy are more 
likely to discourage people from paying attention to their own implicit 
biases and prejudices. 

Sure enough, department heads seem unaware of the implications. 
I interviewed 24 at Aarhus about whether and 
how issues of gender influence their recruitment 
and selection practices. I was told frequently that 
“gender doesnt play any part’, “for us it’s all about 
getting the best” and “what we look at is quality”. 

As Liisa Husu, a gender-studies researcher at 
Örebro University in Sweden, has pointed out, the
myriad disadvantages facing female academics 
often operate as “non-events” (Nature 495, 35-38; 
2013). Women are not being taken into account, 
encouraged or asked along to the same extent as 
their male colleagues. Seen as separate occur- 
rences, such non-events may seem harmless. But 
just as academic success will often accrue to the 
already successful, marginal disadvantages accu- 
mulate over time through self-reinforcing effects, 
with clear implications for gender stratification. 

With an interest in addressing the gender-equality challenge, 
Aarhus University provided the data for this study. It acknowledges 
my findings, and is currently working on a plan to improve the situation,
which should be announced soon.

People who use the word meritocracy as a positive depiction of soci- 
ety are probably unaware of its original satirical and pejorative con- 
notations. It works only if everyone has the opportunity to compete. 

If we really believe in meritocracy as the main principle for sorting 
academics into positions, we must become better at focusing on the 
subtle and unconscious gender biases enmeshed in our day-to-day work 
activities. At stake is not just women’s participation in science, but also 
the stature, integrity and legitimacy of a scientific system renowned for
its conformity to the meritocratic ideal. Although all researchers in principle
should be considered equal, some remain more equal than others.


Mathias Wullum Nielsen is a postdoctoral fellow at the Danish 
Centre for Studies in Research and Research Policy at Aarhus 
University, Denmark. 

e-mail: mwn@ps.au.dk

How the sponge
got its skeleton 


Sponges build their skeletons
using specialized cells that
transport and assemble
structural beams like
construction workers, a
novel way of producing a
skeleton compared to other
animals.

Sponge skeletons are 
made of rod-like silica 
structures called spicules 
that are cemented to rocks 
and to each other with 
collagen. To find out how 
the spicule assembly process 
works, Noriko Funayama 
at Kyoto University in 
Japan and her colleagues 
studied a freshwater sponge 
(Ephydatia fluviatilis) under 
a microscope and discovered 
‘transport cells’ that move 
spicules inside the sponge. 
The cells then push the 
spicules through the animals’ 
outer surface to raise them up 
and attach them together. 

This process allows sponges 
to adopt a huge variety of shapes 
and sizes, the authors say. 
Curr. Biol. http://doi.org/7sr 
(2015) 


Sound switches 
on worm cells 


Ultrasound has been used to 
stimulate individual brain cells 
in a worm. If the technique 
works in mice, it could be a 
less invasive way of studying 
specific neurons. 

Neuroscientists currently 
implant probes into animal 
brains to stimulate cells that 
have been engineered to 
become sensitive to light. 
Sreekanth Chalasani at the Salk 
Institute for Biological Studies 
in La Jolla, California, and his 
colleagues instead introduced 
a pressure-sensitive protein,
TRP-4, into neurons in the
nematode Caenorhabditis
elegans. They then put the
worms in a Petri dish that was
partially submerged in a water
bath and sent a short burst
of ultrasound into the dish,
delivering mechanical signals
to TRP-4 to activate certain
neurons.

By adding the TRP-4 protein
into neurons with different
functions, the researchers were
able to make free-crawling
worms reverse direction,
stop reversing or make
more-frequent sharp turns
in response to a brief pulse of
ultrasound.

Nature Commun. 6, 8264 (2015)


PLANETARY SCIENCE


Global ocean on Enceladus


Beneath an icy crust, Saturn’s moon Enceladus (pictured) has
an ocean that covers its entire globe.

NASA’s Cassini spacecraft measured wobbles in Enceladus’s
rotation over more than seven years. The data confirm that
the crust is moving separately from the rocky core, meaning
that there must be a widespread layer of liquid between them,
says a team led by Peter Thomas of Cornell University in
Ithaca, New York.

Cassini had previously spotted jets of liquid spewing from
the moon’s surface, and other studies have suggested that
Enceladus has an underground sea only near its south pole.
This latest finding further highlights how Enceladus could be
one of the most likely places for extraterrestrial life.

Icarus http://doi.org/7rf (2015)


NEUROSCIENCE
Electric zaps help 
spinal-cord rehab 


Electrically stimulating a 
damaged spinal cord as part 
of rehabilitation therapy may 
enhance improvements in 
movement. 

Steve Perlmutter at the 
University of Washington in 
Seattle and his team bruised 
the spinal cords of rats to 
partially paralyse the animals’ 
forelimbs. They then used a 
neural-computer interface 
connected to the limb muscles 
and spinal cord to direct an 
electrical pulse to just below the 
damaged spinal area whenever 
the device detected activity in 
the weakened muscles. 

Rats that received pulses 
for several weeks recovered 
their ability to reach for and 
grasp food pellets with their 
forelimbs to a greater extent 
than those that did not receive 
pulses. The stimulated rats 
maintained their recovery 
even after the stimulation 
was stopped, suggesting that 
it induced lasting changes in 
the spinal cord. The scientists 
suggest that the approach 
might also work in the clinic. 
Proc. Natl Acad. Sci. USA 
http://doi.org/7q4 (2015) 


Ancient lung parts 
found in fish 


A fish species found in the 
Indian Ocean has a vestigial 
lung, suggesting that its 
ancestors had working lungs 
before they shifted to life in 
deep waters. 

The coelacanth fish 
Latimeria chalumnae is 
descended from ancient 
coelacanths that lived in 
shallow waters. Paulo Brito at 
Rio de Janeiro State University 
in Brazil and his colleagues 
studied the fish at different
stages of development, and
found that a lung developed 
early in the embryo but then 
slowed its growth as the 
embryo matured. As the lung 
shrank in size relative to the 
growing embryo, a fatty organ 
that helps fish to control their 
buoyancy developed. 

This is further evidence that 
ancestral coelacanths could 
breathe air in shallow waters, 
and that they lost the use of the 
lung as it was replaced by the 
fatty organ — an important 
adaptation to the deep ocean. 
Nature Commun. 6, 8222 (2015) 


A balmy climate 
on exoplanets? 


Certain planets outside our 
Solar System could have 
wind patterns that produce 
habitable climates. 

Ludmila Carone at the 
University of Leuven in 
Belgium and her team used 
climate models to investigate 
atmospheric temperatures and 
wind patterns on planets with 
Earth-like atmospheres. The 
chosen planets closely orbit 
red-dwarf stars and always 
face their stars with the same 
side. The team found 3 possible 
climates for planets that have 
orbits of less than 12 days. 

Two of these climates could 
potentially host life, because of 
wind jets that stop the side of 
the planet exposed to the star 
getting too hot. 

The findings could help 
to guide the selection of 
exoplanets for future study, the 
authors say. 

Mon. Not. R. Astron. Soc. 453, 
2412-2437 (2015) 


Orchid shapes 
trick male insects 


Orchids have adapted the shape of
their flowers to attract
pollinating wasps.
These flowering plants lure male
insect pollinators by
producing chemicals
that mimic the pheromones of
their female counterparts, but
the effect of flower shape on
pollinators has been unclear.
To look at this, Marinus
de Jager and Rod Peakall
at the Australian National
University in Canberra studied 
two species of Chiloglottis 
orchids that emit the same 
pheromone and the two 
species of Neozeleboria wasps 
that pollinate the flowers. 
They found that the wasps 
copulated more frequently 
and for longer periods of time 
(pictured) with the orchid that 
they normally pollinate. 

The dimensions and colour 
of the preferred orchid’s 
callus (the central part of the 
flower) closely resembled the 
respective female wasp, and 
the overall shape of the flower 
allowed the male wasp to fit 
better within it. 

Funct. Ecol. http://doi.org/7rd 
(2015) 


Ecological impact 
of crops drops 


The environmental impact of 
maize (corn) and cotton crops 
on US freshwater ecosystems 
has been decreasing over the 
past decade, mainly because 
of the use of genetically 
modified plants that require 
less added pesticide. 
Sangwon Suh and Yi 
Yang at the University 
of California, Santa 
Barbara, assessed the local 
environmental impacts of 
crops, including pollution 
from direct runoff of 
fertilizers and pesticides, as 
well as from processing and 
transportation. They found 
that the impact of maize and 
cotton has decreased by about 
50% over the past decade. 
However, the impact of soya- 
bean crops has increased 
threefold, owing to the 
spread of an invasive 
soya-bean pest 
and a consequent 
rise in the use of 
insecticides. 
The authors say that further improvements may be more
difficult, because pests and weeds are beginning to
develop resistance to the pesticides produced by the
modified crops.
Environ. Res. Lett. 10, 094016 (2015)


How Inuit genomes
have adapted


The genomes of indigenous people in Greenland
(pictured) show how they have
adapted to thousands of years
of frigid temperatures and a
diet that is rich in fatty seafood.

Rasmus Nielsen at the University of California,
Berkeley, and his colleagues analysed the genomes
of 191 Inuit people from Greenland and compared
them with genomes from people of European or Han
Chinese descent. They found that the Inuit genomes were
enriched for genes that convert certain fatty acids in the diet
into more biologically active forms, and that counteract the
oxidative stress associated with a high-fat diet. The team also
discovered a mutation in the Inuit genomes that is linked to
the development of brown fat cells, which generate heat.

These mutations seem to date from at least 20,000 years
ago, when Inuit ancestors lived around the Bering Strait
between Russia and Alaska.
Science 349, 1343–1347 (2015)

NATURE.COM
For the latest research published by Nature visit:
www.nature.com/latestresearch


SOCIAL SELECTION
Popular topics on social media


A call to deal with the data deluge


As the number of biomedical research papers continues its
relentless growth, the quality and credibility of science is
buckling under the weight of all the data. That is the conclusion
of an article in the journal eLife that triggered discussion
online this week. The piece, which is based on interviews
with 20 anonymous US senior scientists, suggests a radical
rethinking of the peer-review system to deal with the ‘overflow’
of data. Erik Millers, a cell biologist at the Karolinska Institute
in Stockholm, summed up the issue on Twitter: “Too many
journals, too many researchers, too low quality: Overflow in
#science and its implications for trust.” But not everyone was
on board. “Is there really an overflow problem in science? I don’t think so,”
tweeted Savraj Grewal, a cell biologist at
the University of Calgary in Canada.
eLife 4, e10825 (2015)

NATURE.COM
For more on popular papers:
go.nature.com/4seski


SEVEN DAYS


Ebola armoury 

On 14 September, the US
Biomedical Advanced
Research and Development
Authority (BARDA) awarded
pharmaceutical giant Johnson
& Johnson US$28.5 million to
manufacture two components
of a candidate Ebola vaccine,
which is in phase II clinical
testing. On 17 September,
ZMapp, a cocktail of three
monoclonal antibodies that
is marketed by LeafBio in
San Diego, California, was
granted fast-track approval
by the US Food and Drug
Administration. And on
21 September, BARDA
awarded up to $38 million to
Regeneron Pharmaceuticals
of Tarrytown, New York,
to develop a monoclonal-antibody
drug against Ebola.


EVENTS 


Refugee scientists 
As Germany struggles to
cope with an influx of tens
of thousands of refugees,
most of them Syrian, German
universities and research
organizations are wooing
talented scientists among
them. Migrant scientists
and researchers at German
institutions can register at
www.chance-for-science.de,
an online platform created
by academics at Leipzig
University to connect highly
qualified refugees with German
scientists. Meanwhile, the
Fraunhofer and Max Planck
societies have announced a
joint initiative to integrate
refugee scientists into institutes
run by the two organizations.


Malaria milestone
A key target in reducing deaths
from malaria has been met, the
World Health Organization
(WHO) and the UN children’s
society UNICEF announced
on 17 September. Since 2000,
when the UN established
malaria reduction as one of
its Millennium Development
Goals, malaria incidence has
dropped by 37% worldwide.
Malaria deaths have fallen by
60%, saving 6.2 million lives.
But the agencies warn that the
battle is not over: more than
200 million new cases have
occurred in 2015 alone. The
WHO’s latest roadmap calls
for a further 90% reduction in
malaria cases and deaths by
2030.


Another low for Arctic sea ice


This summer, Arctic sea ice reached its fourth-lowest
coverage since satellite records began in
1979, the US National Snow and Ice Data Center
(NSIDC) in Boulder, Colorado, announced on
15 September. The minimum surface area of
Arctic ice for 2015 is 34% less than the average


The end of STAP 


Controversial claims that 
embryonic-like cells could 
be produced by exposing 
adult cells to acid have been 
laid to rest in two papers this 
week (A. De Los Angeles 

et al. Nature http://dx.doi. 
org/10.1038/nature15513 
(2015) and D. Konno et 


430 | NATURE | VOL 525 | 24 SEPTEMBER 2015 
© 2015 Macmillan Publishers Limited. All rights reserved 


al. Nature http://dx.doi. 
org/10.1038/nature15366; 
2015). Two 2014 Nature papers 
that made the claims about 
‘stimulus-triggered acquisition 
of pluripotency’ (STAP) 

cells were retracted. The 
follow-up papers report that 

7 labs failed in 133 attempts to 
produce STAP cells. Genetic 
tests found several cases in 
which purported STAP cells 
created at the RIKEN Center 
for Developmental Biology in 
Kobe, Japan, were matches for 
pre-existing embryonic stem 
cells. 


Orion approval 
NASA has approved the next 
stage ofa project to builda 
spacecraft that could send 


astronauts to Mars and beyond. 


The 16 September approval 
came after a review of the 
project, dubbed Orion. NASA 


annual area for 1979-2000 (yellow line). The 
lowest recorded summer ice areas have all 
occurred in the past 9 years. Air temperature, 
atmospheric pressure and wind patterns all 
affect the size of the Arctic ice sheet. The NSIDC 
cautions that further melting could still occur. 


has committed US$6.77 billion 
to Orion. The decision follows 
a successful uncrewed flight 

of the spacecraft in December 
2014. NASA also committed to 
atest flight with astronauts no 
later than April 2023. 


Car emissions bust 
The US Environmental 
Protection Agency (EPA) 
began investigating German 
car maker Volkswagen on 

18 September for designing 
vehicles that can bypass 

US emissions standards. 
Volkswagen officials admitted 
installing software in some 
models that switches on all the 
emissions controls only when it 
detects that a car is undergoing 
an emissions test. During 
normal driving conditions, the 
vehicles emit up to 40 times 
more nitrogen oxides than 
allowed by law, according to
the EPA. The agency says that
the software was installed on 
at least 482,000 diesel vehicles 
sold in the United States. 


LIGO is ‘go’ 


Advanced LIGO (Laser 
Interferometer Gravitational- 
Wave Observatory) officially 
began gathering data on 

18 September. LIGO’s twin 
instruments, in Louisiana and 
Washington state, each have 
two 4-kilometre arms, and 
represent a US$200-million 
overhaul of the initial LIGO, 
which attempted to detect 
gravitational waves in the 
2000s. These space-time ripples 
are one of the major predictions 
of Einstein’s general theory of 
relativity that have still to be 
observed directly. On the same 
day, Italy and France agreed 

to extend their collaboration 
on Virgo, LIGO’s European 
counterpart, for a further five 
years beyond 2020. 


FUNDING


Beaver genome bid
Genomics researchers are hoping that sports fans at Oregon State University in Corvallis will fund an effort to sequence the genome of the college mascot: the North American beaver (Castor canadensis; pictured). The Beaver Genome Project launched on 16 September and hopes to raise US$30,000 by 30 October (go.nature.com/cjaizw). The funds will pay for genome sequencing of Filbert, a four-year-old beaver born and raised at Oregon Zoo in Portland. Researchers hope to gain insight into the rodent’s ability to digest wood and its complex dam-building behaviour.


TREND WATCH
In New England, male researchers in basic biomedical science receive 68% more lab start-up funding than do female applicants, shows a 15 September study (R. Sege et al. J. Am. Med. Assoc. 314, 1175-1177; 2015). A team analysed grant applications in 2012-14 to the Medical Foundation Division of Health Resources in Action (HRA) in Boston, Massachusetts, to calculate institutional support given to 219 applicants of 2 HRA programmes. The disparity was not explained by experience, degree or host-institution wealth.


RESEARCH
CRISPR request 


The UK Human Fertilisation 
and Embryology Authority 
(HFEA), which regulates 
embryo research, is 
considering the country’s 
first application for work that 
would involve editing the 
genomes of human embryos. 
Genome editing is illegal 

for treatment in the United 
Kingdom, but is possible for 
research under licence from 
the HFEA. The application 
to use the CRISPR/Cas9 
technique comes from Kathy 
Niakan, a researcher at the 
Francis Crick Institute in 
London. Earlier this year, a 


Chinese team used the same 
method to edit the genomes of 
human embryos. 


Food and drug chief 


US President Barack Obama 
has nominated cardiologist 
Robert Califf to become the 
next head of the US Food and 
Drug Administration, the 
White House announced on 
15 September. Califf, a clinical- 
trial specialist, is currently 
taking a leave of absence from 
Duke University in Durham, 
North Carolina, while serving 
as the deputy commissioner of 
the agency. The US Senate must 
confirm his nomination before 
Califf can take the position. 


Vatican stargazer
Pope Francis announced on 18 September that US planetary scientist Guy Consolmagno has been appointed director of the Vatican Observatory. A Jesuit brother, science-fiction fan and co-author of the book Would You Baptize an Extraterrestrial?, Consolmagno studies meteorites and asteroids and curates the Vatican's meteorite collection. He replaces José Funes, who had been director since 2006. The observatory is based at Castel Gandolfo, outside Rome, and runs a telescope in Tucson, Arizona.


GENDER GAP IN START-UP SUPPORT
Male and female junior researchers in the New England region of the United States fare very differently in the start-up funding they get, which includes salary, research technicians, equipment and supplies.


COMING UP

27 SEPTEMBER - 2 OCTOBER
Europlanet, the European Planetary Science Congress, is held in Nantes, France.
www.epse2015.eu

28 SEPTEMBER
The Indian Space Research Organisation's astronomy satellite, ASTROSAT, is set to launch from the Satish Dhawan Space Centre at Sriharikota.
http://astrosat.iucaa.in

28 SEPTEMBER - 2 OCTOBER
The future quantum world is discussed at the 5th international quantum cryptography conference in Tokyo.
http://2015.qcrypt.net


Australian science 


Christopher Pyne was 
appointed Australia’s Minister 
for Industry, Innovation and 
Science on 20 September. Prime 
Minister Malcolm Turnbull 
shuffled the cabinet within 

days of his 14 September coup 
over former prime minister 
Tony Abbott, whose tenure 

was marked by drastic cuts to 
research funding. Earlier this 
year, as education and training 
minister, Pyne was involved
in a controversial decision to
award funding to the University 
of Western Australia in Perth 
for a research centre linked to 
climate-policy sceptic Bjorn 
Lomborg. The university 
ultimately rejected the funding. 


> NATURE.COM 
For daily news updates see: 
www.nature.com/news 
MIDDLE EAST


Syrian refugees wait for a bus in Istanbul, Turkey: many young people are missing out on higher education as a result of conflict in the Middle East. 


Lost generation looms as 
refugees miss university 


Educational void risks hampering reconstruction in Middle East. 


BY DECLAN BUTLER 


Human-rights organizations are calling
on universities and governments
worldwide to invest more in the
education of the hundreds of thousands of
student refugees who are fleeing war-torn
regions of the Middle East.

They warn that the countries in conflict risk losing a future generation of scientists, engineers, physicians, teachers and leaders — and that university-aged refugees who have found shelter elsewhere represent a crucial opportunity to reverse some of the lost intellectual capital. “Each scholar and student that we lose now deepens the challenge of restoring the region when the violence eventually subsides,” says Robert Quinn, executive director of the Scholars at Risk Network, a human-rights group headquartered in New York City.

Quinn also cautions that allowing an educational void to develop in the Middle East could create a fertile recruiting environment for radical militias and terrorists. “It is deeply in the interest of Europe and the West to protect and invest in the intellectual capital of the region,” he says. “The failure to invest massively is foolishly shortsighted.”

Conflicts in Syria, Iraq and Yemen, as well 
as in Libya and other North African countries, 
have led to a record number of refugees. By 
the end of 2014, 60 million people worldwide 
were seeking refuge either in safer parts of their 
countries or abroad, according to the Office 
of the United Nations High Commissioner 
for Refugees. That is the highest number ever 
recorded, and almost double the 37.5 million 
displaced individuals a decade earlier. 

Syria, which had a population of nearly 21 million before the ongoing conflict there began four years ago, has produced the most refugees, with 7.6 million people displaced internally and a further 4 million forced to flee the country. Around 10% of those people are of university age, estimates James King, who is a senior researcher at the Scholar Rescue Fund, part of the Institute of International Education (IIE), a non-profit, educational-exchange organization in New York City.

Yet the university system in Syria has all 
but collapsed, and few of the young people 
who have left the country are receiving higher 
education. Of those refugees who fled abroad, 
most have found temporary shelter in neigh- 
bouring countries — Turkey is hosting some 
1.8 million, Lebanon 1.2 million and Jordan 
630,000 — but only around 5% of the uni- 
versity-aged refugees in these countries are 
enrolled at local institutions, according to a 
March report funded by the European Com- 
mission (see go.nature.com/9ljpbl). 

Before the conflict began, 26% of young 
adults in Syria were receiving tertiary educa- 
tion. That leaves hundreds of thousands of 
people who would normally be attending uni- 
versity going without. 

Even when universities in the refugees’ host 
countries have capacity — and this in itself is 
an issue; Turkey, for example, is struggling to 
accommodate all of its own eligible and inter- 
ested students — there are a string of further 
impediments to enrolment. Many students 
have fled without documents, says King, 
including records of past academic creden- 
tials. Other issues are financial and material 
hardships, which can force young adults to 
work, leaving them no time for education. In 
Turkey, where just 1% of Syrian refugees aged 
18-24 have found university places, language 


difficulties are a big problem. 

Scholarships are available. The IIE-led Syria Consortium for Higher Education in Crisis, a network of higher-education institutions worldwide that was created in 2012, has provided US$4.5 million to support 333 Syrian students, including 158 scholarships to attend universities in Western countries. At least 20 similar initiatives also offer scholarships to institutions across the globe. However, demand far outstrips supply: these combined efforts have been able to provide only around 7,000 students with [...] of tertiary [...]

Allan Goodman, president and chief executive of the IIE, notes the sheer scale of the crisis. “No organization or country is set up to deal with it,” he says. “The only thing we can do is — one by one, family by family, scholar by scholar, student by student — try to help individuals.”

He also says that humanitarian efforts have 
tended to focus on saving lives and relieving 
misery among those fleeing conflict. “Edu- 
cation is the orphan of all these crises,” he 
says. “People are so concerned about food, 
water, shelter and other basics, and we haven't 
thought enough about education.” The 1.5% of 
global humanitarian aid that goes to education, 
meanwhile, is spent largely on primary and 
secondary schooling, not higher education, 
which traditionally has been seen as a luxury. 

There are signs that attitudes are chang- 
ing. In May, the European Union's trust fund 
for the Syrian crisis committed €12 million 
(US$14.5 million) to assist 20,000 Syrian
refugees in obtaining higher education through
scholarships and other means. As the European 
Commission report notes, however, scholar- 
ships cannot meet the enormous need, which 
would amount to billions, not millions, of euros. 

It would be more cost effective to provide 
direct financial aid to universities in the coun- 
tries with the most Syrian refugees, the report 
states. And various organizations, including 
the UN children’s charity, UNICEF, are explor- 
ing whether the massive open online courses 
(MOOCs) now offered by some top universi- 
ties could also help. By using recorded lectures 
and social-networking-style communication, 
for example, MOOCs are intended to democ- 
ratize access to a world-class education. But 
they are largely untested in a refugee situation, 
Goodman says, and most students still want a 
diploma accredited by a ministry of education. 
“A MOOC from Stanford or MIT isn't the same,” 
he says. “The most durable situations are those 
that seek to integrate students into national uni- 
versity systems.” 

Clearly, a long-term solution will require 
enormous investment and much greater 
involvement by higher-education institutions 
worldwide, Quinn says. Next month, the 
IIE and other organizations will hold a two- 
day workshop in Istanbul, Turkey, aimed at 
better coordinating efforts and exploring fresh 
approaches to scaling up access. 

The challenge is great — not least because 
the conflicts seem set to get worse before they 
get better. “But this must be measured against 
the costs of not doing it,” says Quinn. “If we 
invest over the next five or ten years in educat- 
ing and strengthening as many Middle East 
citizens and children as possible, we will have 
planted the seeds of a transformed region and 
much brighter future for the world.”


SUSTAINABLE AID 


UN sets out next 
development goals 


Scientists call for sharper focus in anti-poverty push. 


BY JEFF TOLLEFSON 


On 25 September, Pope Francis will address the United Nations just before a three-day meeting that will set the agenda for international development efforts over the next 15 years. At the Sustainable Development Summit in New York, global leaders will adopt 17 goals that are meant to improve the lives of the world’s poorest people by 2030, without jeopardizing the health of the planet.

Ambitious and broad, these Sustainable 
Development Goals (SDGs) would, if met, 
greatly improve human welfare. But some 
experts fear that the goals are too numerous 
and vague to have practical value. “I’m a little 
worried that there are too many of them,” 
says Steven Radelet, director of the Global 
Human Development Program at Georgetown
University in Washington DC. “They may fall
prey to the old adage that if everything is a
priority, then nothing is a priority.”

First on the list: “End poverty in all its forms 
everywhere”. Second is to end hunger and 
achieve food security while improving nutri- 
tion and promoting sustainable agriculture. 
The list goes on to address fundamental issues 
such as education, gender equality and access 
to water and basic sanitation services. It also 
calls for economic growth, environmental con- 
servation and clean energy for all people, while 
urging action to combat climate change. The 
goals are supplemented by 169 specific targets 
that are meant to clarify the work that needs 
to be done. 

Under discussion since 2012, the SDGs replace the expiring Millennium Development Goals, which the UN adopted in 2000. Those eight objectives called, among other things, for halving extreme poverty, reducing mortality among children under five by two-thirds and instituting universal primary education, all by 2015. Although the world has made considerable advances in many of these areas, it has been debated how much impact the goals themselves have had. Much of the progress in alleviating poverty over the past two decades, for example, has come from rapid economic development in southeast Asia and China.

The UN goal of ending poverty would help these street children in the Democratic Republic of the Congo.

Moving forward, the challenge for govern- 
ments will be to invest limited resources 
effectively and track progress towards the 
goals. The UN is discussing how to structure 
a progress-assessment system based on a list 
of measurable indicators that it is develop- 
ing for the targets. As it stands, the list is too 
long for governments seeking to measure 
their progress, says Mark Stafford Smith, 
a researcher at Australia's Commonwealth
Scientific and Industrial Research Organisa- 
tion in Canberra. 

It will be up to the scientific community 
to identify simpler indicators and policies 
that will promote progress, says Stafford 
Smith, who chairs the scientific committee 
of Future Earth, an international clearing 
house for sustainability research. Replacing 
firewood with a more sustainable fuel source, 
for instance, would boost air quality and 
therefore improve human health (goal 3), 
while reducing the impact on local ecosys- 
tems (goal 15). And by reducing time spent 
foraging for fuel, it would free children to go 
to school (goal 4) and empower women to 


contribute to economic growth by earning 
money (goals 5 and 8). 

Researchers also want to find ways to prevent conflicts between goals. For example, without advances in efficiency and a shift towards renewable energy, the expansion of access to modern energy sources (goal 7) would interfere with the goal of keeping [...] in check (goal 13). “People have piled everything in there, but the research community can focus on a much smaller set of integrated goals,” says Stafford Smith. “If we don’t do that, then we will find that these potential conflicts become real ones.”

Through a project called The World in 2050, 
researchers who have used computer models 
to explore the socio-economic implications of 
climate change are leading an analysis to iden- 
tify policy scenarios that can assure that the 
goals are met over the next few decades. 

One of the institutions leading the effort 
is the International Institute for Applied 
Systems Analysis in Laxenburg, Austria. Its 
deputy director-general, Nebojsa Nakicenovic, 
says: “The idea is to understand how one can 
achieve all of those goals together.”

Brain stimulation in children
spurs hope — and concern 


Treatment of developing brains offers greater scope for improvement but also intensifies risks. 


Studies aimed at enhancing learning in children are generating controversy. 


BY LINDA GEDDES 


Jack struggled in regular school. Diagnosed
with dyslexia and the mathematical equivalent,
dyscalculia, as well as the movement
disorder dyspraxia, Jack (not his real name)
often misbehaved and played the class clown.
So the boy’s parents were relieved when he was
offered a place at Fairley House in London,
which specializes in helping children with learning
difficulties. Fairley is also possibly the first
school in the world to have offered pupils the
chance to undergo electrical brain stimulation.
The stimulation was done as part of an experiment
in which twelve eight- to ten-year-olds,
including Jack, wore an electrode-equipped cap
while they played a video game. Neuroscientist
Roi Cohen Kadosh of the University of Oxford,
UK, who led the pilot study in 2013, is one of a
handful of researchers across the world who are
investigating whether small, specific areas of a
child’s brain can be safely stimulated to overcome
learning difficulties. “It would be great to
be able to understand how to deliver effective
doses of brain stimulation to kids’ brains, so that
we can get ahead of developmental conditions
before they really start to hold children back in
their learning,” says psychologist Nick Davis of
Swansea University, UK.

The idea of using magnets or electric currents 


to treat psychiatric or learning disorders — or 
just to enhance cognition — has generated 
a flurry of excitement over the past ten years. 
The technique is thought to work by activating 
neural circuits or by making it easier for neu- 
rons to fire. The research is still in its infancy, 
but at least 10,000 adults have undergone such 
stimulation, and it seems to be safe — at least in 
the short term. One version of the technology, 
called transcranial magnetic stimulation (TMS), 
has been approved by the US Food and Drug 
Administration to treat migraine and depres- 
sion in adults. 

Interest is growing, however, in whether such 
technologies might have even greater benefits 
in children. Particularly promising is TMS’s 
cheaper and more-portable cousin, transcranial 
direct-current stimulation (TDCS). 

Researchers say that the stimulation effects
are likely to penetrate deeper in children
because their skulls are thinner than adults’, and
might have more of an impact in brains that are
still growing. However, the same factors that
intensify the potential benefits are also cause for
concern. “It’s like when you build a house: if you
think things are going wrong, it’s much easier
to fix things at the beginning rather than later
on, but it’s also much easier to ruin them,” says
Cohen Kadosh. “We don't know how electrical
stimulation interacts with the developing brain.”

Cohen Kadosh also worries about abuse of 
the technology. Although devices prescribed 
for medical treatments must meet certain safety 
standards, there are currently no laws in either 
Europe or the United States to regulate the use 
of TDCS in people merely hoping to enhance 
their cognition, and companies now sell the 
TDCS headsets online. So parents, say, might 
feel tempted to try to boost the cognitive abili- 
ties of their children outside the controlled con- 
ditions of a lab. After weighing up the pros and 
cons, however, Cohen Kadosh decided to
approach Fairley House about doing a trial. 
He also had to seek ethics approval, which 
he received. “We were very worried about 
putting brain stimulation into place, because 
as a school we knew nothing about it, but we 
were reassured about the ethics and safety,” 
says Jenny Lim, an occupational therapist 
who works with children at the school. 


LEARNING ENHANCER 

The study follows on from one in which 
Cohen Kadosh showed that a variant of 
TDCS called transcranial random-noise 
stimulation (TRNS) could boost mathe- 
matical ability in adults (A. Snowball et al. 
Curr. Biol. 23, 987-992; 2013). 

In the Fairley House study, his team gave 
the 12 children with mathematical learning 
difficulties nine 20-minute training ses- 
sions. Half of the volunteers received TRNS, 
targeted at the brain area responsible for 
processes such as planning and abstract 
reasoning; the other half wore a TRNS cap 
but did not receive any stimulation. TRNS is 
thought to work by modulating brain signals 
during learning: in this case, the children 
moved their bodies from side to side to guide 
a ball on a screen to land at a certain point on
a number line, with the difficulty increasing
as they progressed. 

The children who received stimulation 
showed greater progress in performance 
than did the controls — reaching level 17 
on average, compared with level 14 — as 
well as significant improvements in general 
mathematics test scores. Cohen Kadosh pre- 
sented the analysis at the British Association 
for Psychopharmacology meeting in Bristol 
in late July and has submitted the results for 
publication. He now plans to further this line 
of research. 

But neuroscientist Vincent Walsh at 
University College London's Institute of 
Cognitive Neuroscience thinks that studies 
of brain stimulation in children are prema- 
ture. The benefits observed in young adults 
are not always seen in older people, he says, 
and many electrical-stimulation results 
have yet to be replicated. “There is simply 
no sound scientific basis for extending such 
poor work to children,” he says. 

Davis, by contrast, thinks that such 
experiments are justified, but is concerned 
about the trend to use the techniques out- 
side formal studies. He estimates that at 
least 1,000 children around the world 
have received some kind of brain stimula- 
tion as part of clinical studies, and expects 
more in future. He stresses the importance 
of publishing the results of any work done 
in children. “I would urge all scientists to 
share their results when children and young 
people are given brain stimulation, to allow 
other scientists to learn from ‘failed’ trials 
and to adapt the protocols if needed.”

Science vies for notice
in Canadian election 


Current government has cut funding and limited 
researchers’ influence over policy, critics say. 


BY NICOLA JONES 


Canadians will head to the polls on
19 October in a federal election that
many scientists hope will mark a turning
point after years of declining research
budgets and allegations of government
censorship.

Prime Minister Stephen Harper, who has 
been in office since 2006, now finds his right-leaning
Conservative party in a tight three-way
race with the left-leaning New Democratic 
Party (NDP) and the middle-left Liberals. 
Although science has not emerged as a top 
issue during the campaign, researchers are 
fighting to make their concerns heard. 

In an unprecedented move, the Professional 
Institute of the Public Service of Canada — a 
union in Ottawa that represents more than 
57,000 government scientists and other pro- 
fessionals — is campaigning in the federal 
race. “Here’s how we do things in the Harper 
government,” declares one of the union's radio
advertisements. “We muzzle scientists, we cut 
research and we ignore anyone who doesn't tell 
us what we want to hear.” 

The group estimates that the Harper 
administration has eliminated jobs for some 
2,500 scientists. And the government’s own 
data show that Canada’s ranking for research 
and development spending dropped from 
16th among 41 comparable nations in 2006 to 
23rd in 2011 (the most recent year for which 
government figures are available). Harper has 
also been accused of limiting government sci- 
entists’ ability to communicate with the press 
and public; Canada’s information commis- 
sioner promised to investigate this in 2013, 
but has not yet released any findings. 

“The Harper government has had complete 
disdain for federal government science,” says 
Peter Wells, a marine biologist at Dalhousie 
University in Halifax, Nova Scotia. 

Kai Chan, an ecologist at the University
of British Columbia in Vancouver, is similarly
glum. “I have been continually surprised by how
bad it has gotten in Canada,” says Chan, who
co-founded a group called ‘scienceinpolicy’ in
the United States during George Bush's presidency.
“It’s worse than I could have imagined,
having closely scrutinized what I thought was
the worst in North America.”


Although the Conservative party has done 
little to address such criticisms, members of 
the two opposition parties have called for 
a stronger role for science in government. 
Dozens of NDP and Liberal candidates for 
parliament have declared their support for 
evidence-based decision-making by signing 
a ‘science pledge’ developed by Evidence for 
Democracy, a non-profit science-advocacy 
group in Ottawa. 

These issues were not discussed at a debate in Calgary on 17 September that pitted Harper against NDP leader Thomas Mulcair and Liberal leader Justin Trudeau, and focused on the state of Canada’s economy. But concerns about the condition of Canadian science have nevertheless influenced party platforms.

The Liberal Party has made scientific integrity part of its election campaign, proposing the creation of a central public portal for disseminating government-funded research. The party is seeking to appoint a chief science officer to ensure the free flow of information.

By contrast, Harper’s government phased 
out the position of national science adviser 
in 2007-08, replacing it with the Science, 
Technology and Innovation Council (STIC), 
a body that has drawn criticism from 
scientists for operating behind closed doors. 
STIC reviews issues at the request of the fed- 
eral government, and many of its reports 
are confidential. “It doesn’t even have many 
scientists on it,” says Graham Bell, a biolo- 
gist at McGill University in Montreal and 
president of the Royal Society of Canada, 
who would like to see the science adviser’s post 
re-established. 

Similarly, the NDP has called for a parlia- 
mentary science officer, a position that would 
be independent of the majority party or a coali- 
tion leading the government. 

Katie Gibbs, executive director of Evidence 
for Democracy, says that Canada could benefit 
from either a science adviser or a parliamen- 
tary science officer. “They're different visions,” 
she says. “You could easily have both.”

IN FOCUS


Scripps fills 
top posts 


Duo to focus on finances 
after a failed merger. 


BY ERIKA CHECK HAYDEN 


A geneticist and a chemist will co-lead the Scripps Research Institute in La Jolla, California, the biomedical research organization announced on 18 September. The institute said that Scripps chemist Peter Schultz will take over as chief executive and vice-chair, while molecular biologist Steve Kay, currently at the University of Southern California (USC) in Los Angeles, will assume the institute’s presidency.

The appointments are the latest in a series 
of leadership changes at the research institute. 
Over the past decade, public funding from the 
US National Institutes of Health has flattened, 
competition for philanthropic donations has 
intensified and pharmaceutical companies 
have shifted away from providing unrestricted 
funds for basic research. This has meant that 
Scripps and other independent research organ- 
izations have struggled to stay afloat. 

“The broader significance is this worldwide 
need to change the model for drug discovery, 
and to show that an effective bench-to-bedside 
model can be created within the not-for-profit 
sector,” says Kay, who is dean of the Dornsife 
College of Letters, Arts and Sciences at USC. 

Kay and Schultz's appointments come just 
over a year after the failure of a bid by USC 
to merge with Scripps. In July 2014, Scripps 
president and chief executive Michael Marletta 
departed the institute after a faculty revolt 
against the merger deal. The following month, 
James Paulson, head of the institute’s cell- and 
molecular-biology department, was named 
acting president and chief executive. When 
the merger fell through, Scripps was said to be 
running an operating deficit of US$21 million. 

“Looking forward, I think many scientists 
realize that NIH funding is a good thing if you 
have it, but it’s not sustainable,” says organic 
chemist Phil Baran, who was on the search 
committee that selected Kay and Schultz. “What 
is stable are endowments, which you build by 
having products that give you proceeds, and by 
philanthropy. You get philanthropy by doing the 
best science, so that’s why there is such frenzied 
competition for the brightest minds.” 

Kay, who was previously at Scripps from 
1996 to 2007, hopes that his experience with 
the institute will ease any tensions about his 
arrival from USC.
Multiple devices will allow ASTROSAT to access more wavelengths than most other telescopes.


ASTRONOMY 


Indian telescope set 
for global stardom 


ASTROSAT will extend the capabilities of existing US and 
European observatories, and boost Indian research. 


BY T. V. PADMA 


A satellite is about to bring India
international acclaim, at least in
astronomy circles. On 28 September, 
ASTROSAT, the country’s first space observa- 
tory dedicated to science, will take to the skies. 

As well as boosting the activities of Indian 
astronomers — who are abuzz with excite- 
ment — the satellite is expected to benefit 
researchers all over the world. Designed to 
orbit Earth for five years, it has capabilities 
not offered by existing space telescopes. 

“It is a notable and fantastic step forward
for Indian astronomy, and has broad implications
for astronomers everywhere,” says
Henry Yang, a mechanical engineer at the 
University of California, Santa Barbara, who 
is chair of the board for the Thirty Meter 
Telescope (TMT) project, an observatory 
planned for Mauna Kea, Hawaii. 

India has had ground-based telescopes 
for decades, including the Giant Metrewave 
Radio Telescope near Pune and the Indian 
Astronomical Observatory in the Himalayan 
cold desert of Ladakh. But although these can 
detect radio waves and infrared radiation, 
which easily penetrate Earth’s atmosphere,
they cannot monitor higher frequencies 
that the atmosphere tends to block — most 
ultraviolet light, for example, and all X-rays 
and γ-rays. Without a space telescope of their
own, Indian scientists have had to rely on ones 
operated by NASA and the European Space 
Agency (ESA) to study such radiation bands, 
which carry information about exotic neutron 
stars, newly born or exploding stars and the 
spiralling hot gases around black holes. 

“Often, as we do not know the exact specifics 
of the telescope design, we are not able to tune 
our research proposals accordingly,” says 
Varun Bhalerao, an astrophysicist at India’s 
Inter-University Centre for Astronomy and 
Astrophysics (IUCAA) in Pune. 

Indian astronomers have long been at a 
disadvantage for X-ray and ultraviolet stud- 
ies, says Somak Raychaudhury, who is the 
director of the IUCAA and has been involved 
with ASTROSAT since its inception. Orbiting 
650 kilometres above Earth, ASTROSAT will 
collect data on this portion of the light spec- 
trum, giving Indian scientists faster — and 
guaranteed — access to the information. They 
will also have privileged access. “Everybody,
senior or junior scientists, is talking about studies
they can now propose,” adds Bhalerao, who
is excited about studying neutron stars from 
India without having to wait for international 
support. Bhalerao has been studying these stel- 
lar objects using high-energy X-ray wavelengths 
with NASA’s Nuclear Spectroscopic Telescope Array
(NuSTAR) at the California Institute of Tech- 
nology in Pasadena, and is looking forward to 
extending that study to the lower-energy X-ray 
and ultraviolet bands that will be available 
through ASTROSAT. 

With five instruments, or ‘payloads’, tuned
to detect different types of light, ASTROSAT 
will observe a wider variety of wavelengths 
than most other satellites, from visible light 
to the ultraviolet and X-ray bands. Mylswamy 
Annadurai, director of the Indian Space 
Research Organisation's Satellite Centre in 
Bangalore, calls this “the strength and uniqueness
of ASTROSAT”. Black holes, galaxy clusters
and other celestial objects can blaze with
different wavelengths as different events occur. 
“When all payloads are combined, ASTROSAT 
gives a coverage which no other observatory has 
achieved till now,” he says.

For some researchers, the satellite's X-ray 
detection capability will fill the gap left when 
NASA’s Rossi X-ray Timing Explorer satellite
died in 2012, after 16 years of operations. Like 
Rossi, ASTROSAT will look regularly at large
areas of the sky, enabling it to track simultaneously
a large number of X-ray sources
that change with time, says Randall Smith, 
an astronomer at the Harvard-Smithsonian 
Center for Astrophysics in Cambridge, Mas- 
sachusetts. By contrast, the X-ray telescopes 
currently in space generally focus on studying 
individual objects in great detail. 
ASTROSAT’s X-ray detectors can also 
cope with very bright objects that would satu- 
rate those on other satellites such as NASA's 
Chandra X-ray Observatory or ESA’s X-ray
Multi-Mirror (XMM-Newton) mission.
According to Andrew Fabian at the Univer- 
sity of Cambridge's Institute of Astronomy in 
the United Kingdom, this capability will make 
ASTROSAT “invaluable” for alerting the inter- 
national community to short-lived bursts of 
X-rays — a key indicator that something new 
is happening in space.


Additional reporting by Alexandra Witze. 


CORRECTIONS 

The Editorial ‘Too close for comfort?’ (Nature 
525, 289; 2015) incorrectly stated: “In 

his defence, Folta argued that the money 
supported only travel and outreach, not 
research, and he was therefore under no 
obligation to disclose it”. Folta did not 

say this. He said that he had complied 

with his university’s disclosure rules. 

The News Feature ‘Why interdisciplinary 
research matters’ (Nature 525, 305; 2015) 
incorrectly affiliated Rebekah Brown with 
Monash University’s Water for Liveability 
centre — she is director of the Monash 
Sustainability Institute. The News story 
‘Africa braced for snakebite crisis’ (Nature 
525, 299; 2015) wrongly described snakes 


as ‘poisonous’ instead of ‘venomous’. 
And the News Feature ‘Team science’ 
(Nature 525, 308-311; 2015) gave the 
wrong authors for the report Evaluating 
Interdisciplinary Research. It was written by
Veronica Strang and Tom McLeish. 


CLARIFICATION 

The Editorial ‘Protection priority’ (Nature 
525, 290; 2015) made reference to the fact 
that the mice in the experiments showed 
no visible sign of distress. That statement 
referred only to the animals for which the 
data were not withdrawn. The committee 
did not comment on whether or not the 
animals in the withdrawn experiments 
showed distress. 
THE BIG PEEK


The data contained in tax returns, 
health and welfare records could be a 
gold mine for scientists — but only if 
they can protect people’s privacy. 


BY ERIKA CHECK HAYDEN 
In 2011, six US economists tackled a question at
the heart of education policy: how much does 
great teaching help children in the long run? 
They started with the records of more than 
11,500 Tennessee schoolchildren who, as part 
of an experiment in the 1980s, had been ran- 
domly assigned to high- and average-quality 
teachers between the ages of five and eight. 
Then they gauged the children’s earnings as 
adults from federal tax returns filed in the 
2000s. The analysis¹ showed that the benefits
of a good early education last for decades: each 
year of better teaching in childhood boosted 
an individual’s annual earnings by some 3.5% 
on average. Other data showed the same indi- 
viduals besting their peers on measures such 
as university attendance, retirement savings, 
marriage rates and home ownership. 

The economists’ work was widely hailed 
in education-policy circles, and US President 
Barack Obama cited it 
in his 2012 State of the 
Union address when he 
called for more invest- 
ment in teacher training. 

But for many social 
scientists, the most 
impressive thing was that 
the authors had been able 
to examine US federal tax 
returns: a closely guarded data set that was then 
available to researchers only with tight restric- 
tions. This has made the study an emblem for 
both the challenges and the enormous potential 
power of ‘administrative data’ — information
collected during routine provision of services, 
including tax returns, records of welfare ben- 
efits, data on visits to doctors and hospitals, 
and criminal records. Unlike Internet searches, 
social-media posts and the rest of the digital 
trails that people establish in their daily lives, 
administrative data cover entire populations 
with minimal self-selection effects: in the 
US census, for example, everyone sampled is 
required by law to respond and tell the truth. 

This puts administrative data sets at the 
frontier of social science, says John Friedman, 
an economist at Brown University in Provi- 
dence, Rhode Island, and one of the lead 
authors of the education study¹. “They allow
researchers to not just get at old questions in
a new way,” he says, “but to come at problems
that were completely impossible before.” 


PROBING THE POPULATION 
In the past few years, administrative data 
have been used to investigate issues ranging 
from the side effects of vaccines² to the lasting
impact of a child’s neighbourhood on his or her
ability to earn and prosper as an adult³. Proponents
say that these rich information sources
could greatly improve how governments meas- 
ure the effectiveness of social programmes 
such as providing stipends to help families 
move to more resource-rich neighbourhoods. 
But there is also concern that the rush 
to use these data could pose new threats to 
citizens’ privacy. “The types of protections 
that we're used to thinking about have been 
based on the twin pillars of anonymity and 
informed consent, and neither of those hold 
in this new world,” says Julia Lane, an econ- 
omist at New York University. In 2013, for 
instance, researchers showed that they could 
uncover the identities of supposedly anony- 
mous participants in a genetic study simply 


by cross-referencing their data with publicly 
available genealogical information (see Nature 
497, 172-174; 2013). 

Many people are looking for ways to address 
these concerns without inhibiting research. 
Suggested solutions include policy measures, 
such as an international code of conduct for 
data privacy, and technical methods that allow 
the use of the data while protecting privacy. 
Crucially, notes Lane, although preserving
privacy sometimes complicates researchers’
lives, it is necessary to uphold the public trust 
that makes the work possible. 

“Difficulty in access is a feature, not a bug,” 
she says. “It should be hard to get access to 
data, but it’s very important that such access 
be made possible.” 

Many nations collect administrative data 
on a massive scale, but only a few, notably in
northern Europe, have so far made it easy for 
researchers to use those data. 

In Denmark, for instance, every newborn 
child is assigned a unique identification num- 
ber that tracks his or her lifelong interactions 
with the country’s free health-care system and 
almost every other government service. In 
2002, researchers used data gathered through 
this identification system to retrospectively 
analyse the vaccination and health status of 
almost every child born in the country from 
1991 to 1998 — 537,000 in all. At the time, 
it was the largest study ever to disprove² the
now-debunked link between measles vaccina- 
tion and autism. 

Other countries have begun to catch up. In 
2012, for instance, Britain launched the unified 
UK Data Service to facilitate research access 
to data from the country’s census and other 
surveys. A year later, the service added a new 
Administrative Data Research Network, which 
has centres in England, Scotland, Northern 
Ireland and Wales to provide secure environ- 
ments for researchers to access anonymized 
administrative data. 

In the United States, the Census Bureau has 
been expanding its network of Research Data 
Centers, which currently includes 19 sites 
around the country at which researchers with 
the appropriate permissions can access confi- 
dential data from the bureau itself, as well as 
from other agencies. “We're trying to explore 
all the available ways that we can expand access 
to these rich data sets,” says Ron Jarmin, the 
bureau’s assistant director for research and 
methodology. 

In January, a group of federal agencies, 
foundations and universities created the Institute
for Research on Innovation and Science
at the University of Michigan in Ann Arbor 
to combine university and government data 
and measure the impact of research spending 
on economic outcomes. And in July, the US 
House of Representatives passed a bipartisan 
bill to study whether the federal government 
should provide a central clearing house of sta- 
tistical administrative data. 

Yet vast swathes of 
administrative data are 
still inaccessible, says 
George Alter, director 
of the Inter-university 
Consortium for Politi- 
cal and Social Research 
based at the University of 
Michigan, which serves 
as a data repository for 
approximately 760 institutions. “Health sys- 
tems, social-welfare systems, financial trans- 
actions, business records — those things are 
just not available in most cases because of pri- 
vacy concerns,” says Alter. “This is a big drag 
on research.” 


UNSOUGHT INTIMACY 

Feeding those concerns is the rising public 
unease about online privacy in general. 
Private companies known as data brokers 
operate on a vast scale, collecting and selling 
information about Internet searches, online 
purchases and other data streams that can 
be combined to draw surprisingly intimate 
conclusions. In one famous example, the US 
retailer Target inferred that a teenage girl was 
pregnant based on her purchases there, and 
it began sending her coupons for baby prod- 
ucts; her father was alerted to his impending 
grandchild only when the coupons arrived 
at the family’s home. In a 2014 study⁴ of data
brokers, the US Federal Trade Commission 
pointed out the many ways in which this kind 
of information could harm consumers. People 
who buy products such as blood-sugar moni- 
tors, for instance, might be placed into a ‘dia- 
betes risk’ marketing category that could be 
used by an insurance company to pinpoint a 
potential customer as high risk. 

Many researchers argue, however, that 
there are legitimate scientific uses for such 
data (see Nature 488, 448-450; 2012). Jarmin 
says that the Census Bureau is exploring the 
use of data from credit-card companies to 
monitor economic activity. And researchers 
funded by the US National Science Founda- 
tion are studying how to use public Twitter 
posts to keep track of trends in phenomena 
such as unemployment. 

But not everyone makes the distinction 
between commerce and academia, says Lane. 
“People conflate the concern about big data 
being used for private-sector purposes to make 
money with big data being used for research.” 
In March 2014, for instance, while aiming to 
significantly boost consumer privacy through
a new data-protection regulation, the Euro- 
pean Parliament proposed limiting the use 
of personal health data for research without 
specific consent, which would have severely 
curtailed researchers’ access to those data. 
After objections from organizations such as 
the London-based biomedical-research char- 
ity the Wellcome Trust, the 
proposal looks likely to be 
jettisoned, but its fate will 
not become clear until 
2016, when the final text of 
the regulation comes up for 
approval. 

One solution to the 
privacy concerns has been 
to keep data under lock and 
key, tightly restricting who 
can access it. At the US research data centres, 
for instance, investigators are not allowed 
to take smartphones or flash drives into the 
rooms where they will use the centre’s com- 
puter terminals. The computers themselves 
contain no data, but only link remotely to 
secure servers. 


TECHNICAL ANSWERS 

Computer scientists and cryptographers are 
experimenting with technological solutions. 
One, called differential privacy, adds a small 
amount of distortion to a data set, so that 
querying the data gives a roughly accurate 
result without revealing the identity of the 
individuals involved. The US Census Bureau 
uses this approach for its OnTheMap pro- 
ject, which tracks workers’ daily commutes. 
Researchers at the bureau use actual data to 
build a statistical model based on where indi- 
vidual workers commute each day. They then 
build a synthetic data set that fits the model, 
but does not contain the actual data. This syn- 
thetic data set is released to the public, allow- 
ing users to draw accurate conclusions about 
transport and economic trends without track- 
ing the exact movements of real individuals. 
Researchers are still learning to trust synthetic 
data, however, so few papers that have been 
published on this subject go beyond demon- 
strating the methods. 
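
The basic idea of adding calibrated distortion can be sketched with the Laplace mechanism, a standard building block of differential privacy. This is a minimal illustration only, not the Census Bureau's OnTheMap procedure: the commute distances, the `private_count` helper and the choice of `epsilon` are all assumptions made up for the example.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Return a noisy count of the records that satisfy `predicate`.

    Adding or removing one person changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is used.
    Smaller epsilon means more noise and stronger privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical commute distances in kilometres (invented data).
commutes_km = [3, 12, 25, 7, 41, 18, 9, 30, 2, 16]

rng = random.Random(0)
noisy = private_count(commutes_km, lambda d: d > 15, epsilon=0.5, rng=rng)
print(f"Noisy number of commutes longer than 15 km: {noisy:.2f}")
```

Any single answer is slightly off, but the noise averages out over many queries or large populations, which is why aggregate trends can survive while individual records stay hidden.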

In any case, although synthetic data poten- 
tially solve the privacy problem, there are some 
research applications that cannot tolerate any 
noise in the data. A good example is the work 
showing the effect of neighbourhood on earning
potential³, which was carried out by Raj
Chetty, an economist at Harvard University in 
Cambridge, Massachusetts. Chetty needed to 
track specific individuals to show that the areas 
in which children live their early lives correlate 
with their ability to earn more or less than their 
parents. In subsequent studies⁵, Chetty and his
colleagues showed that moving children from 
resource-poor to resource-rich neighbour- 
hoods can boost their earnings in adulthood, 
proving a causal link. 


Secure multiparty computation is a tech- 
nique that attempts to address this issue by 
allowing multiple data holders to analyse 
parts of the total data set, without revealing the 
underlying data to each other. Only the results 
of the analyses are shared. 
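
One of the simplest forms of the technique is additive secret sharing, sketched below for a joint sum. It is an illustration under stated assumptions, not the protocol used by DARPA or Cybernetica: the modulus, the function names and the three input values are hypothetical, and a real deployment would also need secure channels and protection against dishonest parties.

```python
import secrets

# All arithmetic is done modulo a large prime so that a single
# share on its own looks like a uniformly random number.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Compute the sum of the inputs without pooling the raw inputs.

    Party i splits its value into shares and sends share j to party j.
    Each party adds up the shares it received and publishes only that
    partial total; the partial totals combine into the true sum.
    """
    n = len(private_values)
    distributed = [share(v, n) for v in private_values]
    partial_totals = [
        sum(distributed[i][j] for i in range(n)) % MODULUS
        for j in range(n)
    ]
    return sum(partial_totals) % MODULUS


# Three data holders, each with one confidential figure (invented).
total = secure_sum([120, 75, 310])
print(total)  # 505
```

Each party sees only random-looking shares and the final total, which matches the description above: the analysis result is shared, the underlying data are not.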

For instance, in 2010, the US Defense 
Advanced Research Projects Agency (DARPA)
asked a team of cryptographers to develop a
secure multiparty computation protocol to 
analyse the paths of commercial satellites and 
head off costly collisions. Currently, companies 
do this by sharing their orbit data, which they
consider proprietary, with a trusted third party
that performs the analysis. But DARPA con- 
cluded that secure multiparty computation 
could be used to predict possible collisions just 
as effectively, albeit a little more slowly. 

In 2015, the Estonian company Cybernetica, 
based in Tallinn, said that it had used similar 
techniques to analyse financial filings of com- 
panies to detect tax fraud. It is also jointly 
analysing records from the country’s tax and 
education ministries to explore whether uni- 
versity students who hold jobs fail their courses 
more often than those who focus exclusively 
on their studies. 

There are still some problems in need of 
technical solutions — especially as govern- 
ment agencies look beyond their own walls. 
For instance, the Census Bureau wants to 
combine its internal data on the formation and 
activities of companies with public data on pat- 
ents to examine the factors that drive corporate 
innovation. But it could be relatively easy to 
unmask the identities of companies included 
in the analysis by matching them to informa- 
tion in the public patent database. Jarmin’s 
team has not yet worked out an approach that 
adequately protects privacy. 
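
The unmasking risk described here is a linkage attack: records stripped of names can still be matched to a public list when a combination of ordinary attributes is unique. The sketch below shows the idea with invented data; the field names, firms and the `unique_matches` helper are hypothetical and are not drawn from any Census Bureau data set.

```python
from collections import Counter


def unique_matches(released, public, keys):
    """Find public records that match exactly one released record.

    `keys` are the shared attributes (quasi-identifiers). A public
    record is re-identified when its combination of attributes occurs
    once in the released data and once in the public data.
    """
    def key(row):
        return tuple(row[k] for k in keys)

    released_counts = Counter(key(r) for r in released)
    public_counts = Counter(key(r) for r in public)

    hits = {}
    for row in public:
        k = key(row)
        if released_counts[k] == 1 and public_counts[k] == 1:
            hits[row["name"]] = k
    return hits


# 'Anonymized' company records (invented): names removed.
released = [
    {"sector": "biotech", "state": "CA", "founded": 2009},
    {"sector": "software", "state": "NY", "founded": 2011},
    {"sector": "software", "state": "NY", "founded": 2011},
]

# Public patent-style listing (invented): names attached.
public = [
    {"name": "Firm A", "sector": "biotech", "state": "CA", "founded": 2009},
    {"name": "Firm B", "sector": "software", "state": "NY", "founded": 2011},
]

hits = unique_matches(released, public, ("sector", "state", "founded"))
print(hits)  # {'Firm A': ('biotech', 'CA', 2009)}
```

Firm A is exposed because its attribute combination is unique in both lists; Firm B is shielded only because a second released record shares its attributes.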

But for the most part, technical solutions 
are now being put in place. Increasingly, what 
looks likely to hold up the research is a lack of 
clear ethical and legal guidance about how data 
on individuals can be used — for all purposes, 
including research. 

Pam Dixon, executive director of the 
World Privacy Forum in San Diego, Cali- 
fornia, points to programmes such as India’s 
national identification-card system, launched 
in 2010. This effort provided more than 900 
million people with biometric identity cards 
that were linked to photographs, fingerprints 
and iris scans. The cards were supposed to be 
voluntary, and were used to identify rightful 
recipients of social benefits such as fuel and
unemployment aid. 

But the country did not create a legislative 
framework to govern the use of the cards. They 
were soon discussed as gateways for a variety of 
essential services, such as salary payment and 
marriage registrations. This violated the origi- 
nal spirit of the programme, critics contended, 
because data from the cards 
was not supposed to be 
coerced from individuals. 
The Indian Supreme Court 
ruled such uses of the system
illegal on 11 August,
but the country’s Parliament
has still not enacted a
governing framework. 

Likewise, in 2013, the 
United Kingdom launched 
the care.data programme to link records from 
patients’ visits to general practitioners with 
their records from other parts of the health-care 
system, but there was no clear guidance on how 
the project’s data were to be used. After it was 
revealed that the database designed to distrib- 
ute patient data had inappropriately released 
some information to private entities — such as 
actuaries, which aid insurers in setting insur- 
ance rates — care.data came under fire. On 2 
September, the National Health Service (NHS) 
said that the government will conduct a review 
of the security of NHS data and develop new 
opt-out and consent provisions. The system is 
intended to be available to all patients by 2016. 

In the meantime, says Nicola Perrin, 
head of policy at the Wellcome Trust, the 
fallout has created huge delays in existing 
research projects, including clinical trials 
and health evaluation, audit and service 
research. Researchers in charge of SABRE, 
a large cohort study examining how diabetes 
and heart disease affect people of different 
ethnicities, have not received patient updates 
since March 2014; as a result, they risk send- 
ing requests for information to families 
whose loved ones may have died. The epi- 
sode serves, for Perrin, as a cautionary tale 
about how the power of data could backfire if 
social unease with its uses is not addressed as 
soon as possible. “The lesson is to not under- 
estimate public concerns,” she says. “Public 
trust is very fragile — it’s difficult to build 
and easy to break.”


Erika Check Hayden is a reporter for Nature 
in San Francisco, California. 


1. Chetty, R. et al. Q. J. Econ. 126, 1593-1660 (2011).
2. Madsen, K. M. et al. N. Engl. J. Med. 347, 1477-1482 (2002).
3. Chetty, R., Hendren, N., Kline, P. & Saez, E. Q. J. Econ. 129, 1553-1623 (2014).
4. Ramirez, E., Brill, J., Ohlhausen, M. K., Wright, J. D. & McSweeny, T. Data Brokers: A Call for Transparency and Accountability (Federal Trade Commission, 2014).
5. Chetty, R., Hendren, N. & Katz, L. F. NBER Working Paper No. 21156 (2015); available at http://www.nber.org/papers/w21156




[...], French scientists wanted to see
what happened to a mouse brain when they
[...] with the creature’s mitochondria,
[...] that generate energy inside most
cells. The team looked at two mouse
strains, called H and N, that carry slightly
different mitochondrial-DNA sequences.

It was clear that the H mice learned to navigate 
mazes faster than their N cousins, but when the 
team swapped the mitochondria — creating 
H mice with N mitochondria and N mice with H 
mitochondria — their performance changed. 
Mitochondria from N seemed to slow down the 
learning process for H mice. N mice, meanwhile, 
improved slightly with H mitochondria’. And 
the team, led by geneticist Pierre Roubertoux at INSERM, the French 
National Institute for Health and Medical Research in Marseilles, found 
other changes in behaviour, and in brain anatomy, too. 

The results came as a surprise, because such differences between 
mitochondrial genomes were seen as neutral — having no biological 
effect. “The long-held view was that the genetic variation we find within 
the mitochondrial genome doesn’t affect function,” says Damian Dowling,
an evolutionary biologist at Monash University in Melbourne, Australia. 

That view has been changing. A growing body of evidence suggests 
that mitochondria do not just produce energy, but also influence a wide 
range of cellular processes, from cell death to immune responses, and 
mystery


The ‘powerhouses’ 
of the cell may have 
more roles than 
expected. Could that 
generate problems 
for mitochondrial 
replacement 
therapies? 


By Garry Hamilton 


ria 


that variations in the organelle matter very 
much. Variants in mitochondrial DNA are now 
linked to many common human conditions, 
including neurodegenerative diseases, cancer 
and ageing. 

The effects of these variants may come 
about through the organelle’s long-evolved 
partnership with the much-larger nuclear 
genome. Studies in a handful of organisms 
have shown that just as for H and N mice, 
swapping healthy mitochondria between 
closely related strains can cause a mismatch 
between the genomes and can change impor- 
tant traits. The evidence, say Dowling and 
others, should raise questions about the safety 
of a procedure that will soon be used in humans. 

In February, the UK government approved mitochondrial replace- 
ment therapy, a technique that would allow a woman with a mitochon- 
drial disorder to give birth to healthy children by pairing her nuclear 
DNA with the healthy mitochondria from a donor’s egg. The approval 
came after a 3.5-year effort to review the safety and ethics of creating 
individuals with DNA from three people (what some refer to as three- 
parent babies). And although many scientists lauded the decision, some 
worry that it is premature. “They’re not looking at the bigger picture,”
says Ted Morrow, an evolutionary biologist at the University of Sussex 
in Brighton, UK, who is arguing for more-rigorous safety testing. “The 
standards for a shampoo seem to be harsher.”

A common refrain in favour of the therapy is that the genetic 
contribution from mitochondria is very small. And against the 3 billion 
base pairs of DNA and 20,000 genes found in the human nucleus, the 
mitochondrial genome can seem pretty insignificant (see ‘A complicated 
relationship’). Inherited solely through a mother’s egg, it comprises fewer 
than 17,000 base pairs and just 37 genes. But one cell can have thousands 
of copies of the mitochondrial genome, compared with just two of the 
nuclear genome — one from mum and one from dad. 

Mitochondrial DNA also accumulates mutations incredibly fast, at 
about ten times the rate of nuclear DNA — and geneticists can use the 
resulting variation as a sort of molecular clock. The clock has allowed 
scientists to create a human family tree that shows several broadly related 
mitochondrial genomes, known as haplogroups, emerging in Africa 
somewhere around 150,000 years ago, including two that gave rise to the 
thousands of smaller haplogroups now found around the world. 

The standing view was that the genetic differences between 
mitochondria in these groups were little more than a reflection of past 
migrations. But during the 1980s, researchers began to challenge that 
assumption. “Mitochondria control a central 
component of metabolism,” says David Rand,
an evolutionary biologist at Brown University 
in Providence, Rhode Island. “So it followed 
that this variation ought to be very interesting.” 

One way to examine whether mitochondria 
in one population work differently from those 
in another is to swap them. Such experiments 
would be unethical in people and impracti- 
cal in many other animals, so Rand turned to 
fruit flies. He cross-bred two fly strains with 
different mitochondria and then repeatedly 
back-crossed them until the mitochondria 
from one were neatly paired with the nucleus 
of the other. 
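
Why repeated back-crossing works can be seen from a simple expectation. Mitochondria pass down the maternal line, whereas each cross to a male of the other strain halves, on average, the share of nuclear DNA still inherited from the original maternal strain. The sketch below computes that expected share; it is an idealized calculation that ignores chromosomal linkage and selection, and the function name is made up for the example.

```python
from fractions import Fraction


def maternal_nuclear_fraction(backcrosses):
    """Expected share of nuclear DNA from the original maternal strain.

    The first-generation hybrid carries one half. Every later
    back-cross of a daughter to a male of the other strain halves
    the expected share again, while the mitochondria stay maternal.
    """
    return Fraction(1, 2) ** (backcrosses + 1)


for n in (0, 1, 5, 10, 20):
    share_left = maternal_nuclear_fraction(n)
    print(f"after {n:2d} back-crosses: {float(share_left):.7f}")
```

Under these assumptions the original nuclear contribution falls below 0.1% after ten back-crosses, leaving the mitochondria of one strain paired with an almost pure nucleus of the other.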

He then put fruit flies with similar nuclear 
genomes but different mitochondria together 
in a cage, and found that flies with specific 
mitochondrial genomes would quickly come to dominate the popula- 
tion’. Something in the mitochondria was giving them a survival advan- 
tage. Subsequent work by Rand, Dowling and others has shown that it 
is not just the mitochondrial genome, but rather its interaction with the 
nuclear one that seems to be affecting a range of traits, including lifespan, 
reproductive success, rate of development, ageing, growth, movement, 
morphology and behaviour. 

The findings extended beyond inbred laboratory animals such as fruit 
flies and mice. Over the past two decades, Ron Burton at the Scripps 
Institution of Oceanography in La Jolla, California, has found that cross- 
breeding closely related populations of tiny crustaceans known as cope- 
pods from tide pools on the Pacific coast often leads to a massive fitness 
breakdown for the animals’. Two clues led Burton to suspect that the rea- 
son was a mismatch between nuclear and mitochondrial DNA. First, the 
populations had very different mitochondrial genomes. Second, energy 
production was at the heart of all the sickly organisms’ deficiencies. 

The clincher came when Burton chose females from the unhealthy 
animals and mated them with males from the same population as the 
females’ mothers. The resulting offspring, which once again had a natu- 
ral combination of mitochondrial and nuclear genomes, were healthy. 
“That’s pretty striking,” says Burton. “And we did it with multiple different
crosses.” 

Extending these results to mammals has been difficult: Roubertoux’s 
mitochondrially mismatched mouse lines took more than 20 generations 
and 12 years to develop. But there are a few studies that have found simi- 
lar results. Douglas Wallace, who heads the Center for Mitochondrial 
and Epigenomic Medicine at the Children’s Hospital of Philadelphia, 
combined the nucleus from a lab-mouse strain with mitochondria from 
a mouse known to contain two different, but normal, mitochondrial 
genomes. His group found that the modified mice had altered circadian 
rhythms — the natural oscillations that follow a roughly 24-hour 
cycle — performed worse in mazes and seemed more stressed in certain 
experimental conditions, compared with unmodified animals’. 

In humans, there is only indirect evidence that the common variation 
found in the mitochondrial genomes of healthy individuals could have 
biological effects. Certain mitochondrial haplotypes have been linked 
to disorders such as type-2 diabetes, Parkinson's disease and cancer, and 
normal variation in the mitochondria is thought to influence general 
physical traits such as longevity and elite athleticism’. 

“Correlations are just correlations,” says Göran Arnqvist, an evolution- 
ary biologist at Uppsala University in Sweden, “but there’s now a large 
enough number of them to in itself provide ample evidence that there’s 
something going on with mitochondrial DNA.” 


Powerhouse pairing 

The question remains exactly how these variations could affect such a 
broad range of biological functions. Part of the answer seems to lie in 
their ties with the nuclear genome. Roughly 1,500 nuclear genes are 
involved in mitochondrial function, including 
around 76 that encode proteins which bind to 
mitochondrially derived peptides. 

Common variants could alter how these 
proteins interact. If a mitochondrially derived 
protein needs to fit snugly against a nuclear 
counterpart, even tiny changes in one partner 
could disrupt that binding, a possibility sup- 
ported by 3D modelling®”. 

A study published in 2009 compared mito- 
chondria from two common human Euro- 
pean lineages, called haplogroups J and H, in 
cells with the same nuclear DNA®. It showed 
that cells with haplogroup J mitochondria 
contained more than twice as many copies 
of mitochondrial DNA as those with haplo- 
group H, a difference that would be expected 
to have a big influence on the production of 
mitochondrial proteins. 

Such effects could alter the rate at which mitochondria supply energy, 
with consequences for many cellular activities. But emerging evidence 
points to other ways that mitochondria could have broad biological 
implications. 

Various molecules created during the production of energy, such as 
free radicals, may have a direct influence on processes involved in age- 
ing, inflammation and in some basic cell functions. And in May, a team 
of researchers led by Gerald Shadel at Yale University in New Haven, 
Connecticut, showed in mice that mitochondrial DNA can itself trigger 
an innate immune response against viral infection’. “They're not just 
power factories,” says Rand. “They’re also in a sense a nerve centre, a 
thermostat for the cell and how it’s doing.” 

Researchers have also found evidence for a new class of 
mitochondrially derived peptide that might be encoded by sequences 
in other mitochondrial genes. One of these is humanin, a small peptide 
discovered by Japanese researchers in 2001 that increases sensitivity to 
insulin in diabetes-prone rats and mice’®. The gene that encodes it is 
thought to reside in the mitochondrial gene for 16S ribosomal RNA. 
In March, researchers in the United States found a second potential 
example, MOTS-c, which is encoded by a small stretch of DNA tucked 
away in another gene. MOTS-c functions like a hormone, and when 
injected into mice helps to enhance insulin sensitivity and protect 
against obesity". 

Some researchers now suspect that mitochondrial DNA produces a 
vast array of biologically active molecules — other small peptides as well 
as short stretches of RNA — that are part of a network of cross-communi- 
cation between the mitochondrial and nuclear genomes. “The very viabil- 
ity of complex life — eukaryote life — depends on a really coordinated, 
intimate set of interactions between these two genomes,” Dowling says. 
It is a partnership that has shaped and been shaped by aeons of evolution. 


Figure: A complicated relationship 

The mitochondrial genome has evolved in concert with the nucleus of 
complex cells for hundreds of millions of years. Evidence suggests that 
even slight disruption of that relationship could have unexpected effects. 

Mitochondrial genome 
* 17,000 base pairs 
* 37 genes 
* Thousands of copies per cell 

Swapping mitochondria in flies has affected the expression of more than 
1,000 nuclear genes, many unrelated to mitochondrial function. 

Of the nuclear genes involved in mitochondrial function, 76 produce 
proteins that bind to mitochondrially derived peptides. Gene variants 
can disrupt this binding. 

Given how well evolution has tuned this communication, many 
biologists are concerned about disrupting it in mitochondrial replace- 
ment therapy. The results of mitochondria-swapping experiments in 
other organisms, they say, should not be overlooked. “We haven't seen 
anything fundamentally different between flies and humans in terms 
of interactions between the mitochondria and the nucleus,” says Klaus 
Reinhardt, an evolutionary biologist at the University of Tübingen in 
Germany. 

The health effects may not be dramatic, says Burton, and they might 
not become apparent until decades after birth. “But I think there’s a 
definite possibility that you’d see things like disrupted fertility function, 
various forms of metabolic syndromes and changes in things that relate 
to metabolism in general.” 


Call for caution 

Reinhardt, Dowling and Morrow outlined their concerns in a 2013 
paper” in Science. They called for studies aimed at addressing how 
mammals born after mitochondrial replacement fare in adulthood, 
and argued that scientists should at least look into haplotype match- 
ing — ensuring that the mitochondria from the donor and recipient 
come from the same haplogroup before transplant. Moving ahead at this 
juncture, they argued, “would place an experimental risk on families”. 

But other researchers disagree. Scientists at Newcastle University, 
UK, and at Oregon Health & Science University (OHSU) in Beaverton, 
two institutions that pioneered mitochondrial replacement therapies, 
pointed to perfectly healthy macaque monkeys born at OHSU in 2009 
after the procedure”. 

They also pointed out that most of the evidence for risk stems from 
studies that used strains of flies and mice that had been highly inbred 
— a process that would increase the genetic differences between the 
strains and therefore produce a greater ‘mismatch’ when the mitochon- 
dria are swapped. They argued that such studies have little relevance for 
human populations that interbreed all the time. The “lack of any reliable 
evidence of mitochondrial–nuclear interaction as a cause of disease in 
human outbred populations”, they wrote, “provides the necessary reas- 
surance to proceed”. 

Doug Turnbull, who heads the Newcastle group, also argues that 
correlations between different human mitochondrial haplotypes and 
common diseases are not definitive. “If we’re struggling to find a signal,” 
he says, “is that really something that’s likely to cause major difficulties?” 

Ultimately, government approval hinged on a 2014 report prepared by 
a scientific review panel set up by the Human Fertilisation and Embryol- 
ogy Authority (HFEA), the body that regulates assisted-reproduction 
treatments in the United Kingdom. The panel’s chair, Andy Greenfield 
of the Medical Research Council, would not comment for this story, 
but the HFEA provided a written response to questions. It stated that 
deliberations were “time-consuming and as complex as the data them- 
selves”, adding that most respondents presenting evidence to the panel 
viewed these issues as “at best minor or non-existent”. In its final report, 
the panel recommended that haplogroup matching be considered “as 
a precautionary step”. But it also stated that the benefits of doing so are 
“likely to be minimal”. 

Some of the critics of the decision grant that mitochondrial replacement 
may be worth the risks for women who want to avoid passing rare and 
devastating disorders on to their children. Many, however, think that more 
time is needed to assess the risks. There is also concern that proponents of 
the therapy trivialized the role of mitochondria — particularly by likening 
mitochondrial replacement to changing the batteries in a camera. Critics 
argue that a failure to appreciate all the other processes in which the orga- 
nelle is involved could lead to inadequate controls and wider application 
of mitochondrial replacement in fertility clinics. 

“You may have a few thousand people who suffer from mitochon- 
drial diseases,” says David Keefe, a reproductive biologist at New York 
University’s Langone Medical Center. “There are tens of millions of 
women who have infertility who may see this as a way to have the bat- 
teries charged in their eggs.” 

At least one clinic in the United States has used cytoplasm from donor 
eggs to ‘normalize’ the eggs of women being treated for infertility, start- 
ing in the late 1990s (see Nature 509, 414-417; 2014). The procedure, 
which probably transferred mitochondria as well, resulted in 17 births 
before the US Food and Drug Administration requested safety studies 
and the clinic stopped offering the procedure in 2001. Little is known 
about the health of the children born as a result of the procedure. 

Turnbull rejects the slippery-slope argument. “In the UK, the 
legislation is very clear that mitochondrial donation can only be used 
to prevent serious mitochondrial disease,” he says. “I do not think there 
is any good evidence it would be useful for anything else.” 

Although no one knows what the rapidly growing field of 
mitochondrial research will uncover next, both sides agree that there is 
no way to say for sure what will happen when doctors swap mitochon- 
dria in humans, short of actually doing it. For Dowling, at least, it is one 
scientific debate that he would rather not win. “I’d like to see this work so 
female sufferers of mitochondrial disease can have unaffected children,” 
he says. “So I hope we’re wrong.” SEE EDITORIAL P.425 


Garry Hamilton is a science writer in Seattle, Washington. 
The Redox Cube is a planned 25-kilowatt fuel cell to be run on natural gas. 


Reimagine 
fuel cells 


Combine energy generation and storage to ensure that 
networks remain robust as more renewable technologies 
are adopted, urges John P. Lemmon. 


Investments in solar photovoltaics and 
wind turbines are soaring as costs fall 
and governments and companies seek 
to reduce greenhouse-gas emissions. But 
fluctuating power from the wind and sun 
threatens to destabilize electricity grids. As 
more intermittent sources are connected, 
the power surges and crashes. This increases 
variability in voltage, in power and in the 
frequency of alternating current. 
Already, Germany, which produces more 
than 25% of its energy from renewables, is 
experiencing problems. Voltage glitches, by 
tripping crucial components and destroy- 
ing equipment in factories and plants, have 
caused hundreds of thousands of euros of 
damage’. Using coal plants to maintain 
stability adds greenhouse-gas emissions. 
Generation and load must be balanced. 
Three approaches are in operation: using 
real-time demand and pricing incentives to 
control load; ramping natural-gas plants up 
or down to compensate for fluctuating power; 
and storing energy. Each has downsides. 
Repeated requests to reduce demand agitate 
users and may be manipulated by third parties 
who stand to profit. Gas turbines and batter- 
ies cannot both provide rapid (less than a second) 
high-power responses and supply energy 
for long periods. Batteries degrade and are 
expensive to replace. Combinations of bat- 
teries require multiple sets of electronics and 
control systems. 

New types of fuel cell on the horizon could 
eliminate the need for such trade-offs and 
ease the integration of renewables into the 
grid. Currently, fuel cells are used to generate 
only electricity and heat. They can be modi- 
fied to store energy and produce liquid fuels 
such as methanol, thanks to breakthroughs 
in materials and designs. Developing fuel 
cells with a battery mode is one focus of the 
programme I direct at the US Advanced 
Research Projects Agency-Energy (ARPA-E). 
I lead 13 projects across academia, industry 
and national laboratories. 

Researchers must now demonstrate that 
fuel cells can perform multiple functions and 
still generate power efficiently. 


INTEGRATION CHALLENGE 

Power generation from distributed sources 
is expanding rapidly. Up to three times 
the current capacity of solar photovoltaics 
is projected’ to come online in the United 
States by 2040 (an extra 15-75 gigawatts; 
GW). But conventional power grids are not 
designed to handle hundreds of thousands 
of small variable sources. 

Renewables have two types of intermit- 
tency. First, output may fluctuate randomly 
moment-to-moment — for example, because 
of clouds casting shadows on solar panels. 
Second, output varies predictably through 
the day and night. According to one projec- 
tion by the California Independent System 
Operator, the power needs for a late Califor- 
nia afternoon in March 2020, when the sun is 
setting and people are returning home, will 
require a 13 GW ramp up over 3 hours — 
that is equivalent to switching on more than 
20 power plants of 600-megawatt capacity 
(see ‘Daily load’). 
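
As a rough check on the arithmetic in this projection, the sketch below divides the 13 GW ramp by a 600-megawatt plant size and by the 3-hour window. The function names are illustrative assumptions, and the calculation ignores practical factors such as plant availability or partial loading.

```python
import math


def plants_needed(ramp_gw: float, plant_mw: float) -> int:
    """Whole number of plants of a given size needed to cover a ramp."""
    return math.ceil(ramp_gw * 1000 / plant_mw)


def ramp_rate_mw_per_min(ramp_gw: float, hours: float) -> float:
    """Average ramp rate, in megawatts per minute, over the window."""
    return ramp_gw * 1000 / (hours * 60)


# Figures quoted in the text: a 13 GW ramp over 3 hours, 600 MW plants.
print(plants_needed(13, 600))
print(round(ramp_rate_mw_per_min(13, 3), 1))
```

Under these assumptions the ramp corresponds to 22 plants, consistent with "more than 20", and to an average rate of roughly 72 megawatts per minute.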

Actions to decrease demand — such 
as changing thermostat settings by a few 
degrees — are often the first deployed when 
the grid is pushed to its limit on, say, a 
sweltering summer day in a large city. Such 
‘demand response’ can potentially shed up to 
9% of the peak load in the United States’. But 
policy uncertainties make demand response 
fraught for grid system operators. 

DAILY LOAD 

The difference between power demand and generation varies throughout the day as sunshine and 
wind change and people travel to and from home. More power plants must be switched on in the 
evening. This imbalance is projected to increase as more renewables are used. 

In May 2014, the US Court of Appeals 
struck down a 2011 order by the Federal 
Energy Regulatory Commission (FERC) 
that required grid operators to pay demand- 
response resources the full market price for 
energy, just as they would for conventional 
power generators. The court ruled that 
because demand response constitutes a 
retail market, it should be subjected to state 
(rather than FERC) jurisdiction. President 
Barack Obama appealed the ruling and the 
US Supreme Court will hear the case later 
this year. Either way, demand-side measures 
alone cannot accommodate the amount of 
renewables projected to be installed in the 
United States in the coming decades. 

The latest natural-gas turbines are designed 
with a buffering capacity, to help to smooth 
out the power output from renewables. They 
get up to speed quickly, with ramping rates 
of tens of megawatts per minute. Still, it 
takes minutes — not seconds — to produce 
power from a standing start. Turbines that are 
already spinning can be brought into opera- 
tion within seconds but with lower efficiency, 
higher emissions and, in the United States, 
13-24% higher operating costs’. 

The use of batteries for grid energy storage 
is receiving more attention. Energy compa- 
nies are becoming increasingly confident 
that such devices are reliable in the field. 
Regulations are becoming more favourable. 
California has mandated 1.32 GW of storage 
capacity by 2020 and its utility companies 
have begun to buy storage devices. 

One milestone was an announcement 
in May from the electric-car maker Tesla 
Motors, of Palo Alto, California. It has devel- 
oped 7- and 10-kilowatt-hour (kWh) residen- 
tial battery systems costing up to US$3,500. 
The company hopes that homeowners will 
want to avoid power cuts and paying peak 
electricity rates by charging the battery with 
solar panels or when grid prices are off-peak. 
But a battery’s storage capacity is limited by 
its size. Tesla’s 10 kWh, 1-metre-square, lith- 
ium-ion battery panel powers a home for one 
day, not enough for power cuts such as those 
following hurricanes and storms. Further 
improvements to cost and performance will 
come from higher-volume manufacturing of 
lithium-ion batteries and competing tech- 
nologies such as redox (reduction-oxidation) 
flow batteries. No single technology provides 
the optimal balance. 
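
To illustrate how storage capacity limits outage coverage, the sketch below estimates how many battery units a home would need for a multi-day power cut. It assumes a household demand of 10 kWh per day, which is implied by the statement that a 10 kWh unit powers a home for one day. The function name and the example outage lengths are hypothetical, and no recharging during the outage is assumed.

```python
import math


def batteries_for_outage(outage_days: float,
                         daily_use_kwh: float = 10.0,
                         battery_kwh: float = 10.0) -> int:
    """Battery units needed to ride out an outage without recharging."""
    return math.ceil(outage_days * daily_use_kwh / battery_kwh)


# One 10 kWh unit covers a single day; a five-day cut needs five.
print(batteries_for_outage(1))
print(batteries_for_outage(5))
# The smaller 7 kWh unit would need more devices for a three-day cut.
print(batteries_for_outage(3, battery_kwh=7.0))
```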

Existing technologies such as improved 
wind-forecasting models, expansion of trans- 
mission infrastructure and other measures 
may help grids to remain stable as renewables 
penetration increases”. But larger grids (tens 
of megawatts or more) operating with more 
than 50% distributed generation are unprec- 
edented. The adequacy of current technolo- 
gies to support such a grid is unknown. 


HYBRID FUEL CELLS 
I back a different approach: fuel cells with 
built-in charge storage. Fuel cells have been 
touted for energy production for decades 
owing to their high electrical efficiencies. 
They have not been widely adopted because 
they are more expensive than combustion 
generators ($3,000 per kilowatt, compared 
with $1,000 per kilowatt). Over the past 
15 years, US government-funded research 
programmes have aimed to lower those 
costs. Enhancing fuel-cell functionality 
would make them even more valuable. 
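
The cost gap can be made concrete with a small calculation. The per-kilowatt figures below are the ones quoted above; the 25-kilowatt system size is borrowed from the Redox Cube caption purely as an illustrative assumption, and the function name is hypothetical.

```python
FUEL_CELL_USD_PER_KW = 3000   # figure quoted in the text
COMBUSTION_USD_PER_KW = 1000  # figure quoted in the text


def capital_cost(power_kw: float, usd_per_kw: float) -> float:
    """Up-front cost of a generator of the given rated power."""
    return power_kw * usd_per_kw


size_kw = 25  # assumed system size, for illustration only
fuel_cell = capital_cost(size_kw, FUEL_CELL_USD_PER_KW)
combustion = capital_cost(size_kw, COMBUSTION_USD_PER_KW)
print(fuel_cell, combustion, fuel_cell - combustion)
```

For a system of that size the fuel cell would cost $75,000 against $25,000 for a combustion generator, a premium of $50,000 that added functions would need to justify.
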
Fuel cells are electrochemical devices 
similar to batteries that rely on a substance 
such as hydrogen or methane to produce 
power and heat. Two main types are in use: 
polymer fuel cells that operate at around 
80°C for vehicles; and solid-oxide fuel cells 
that operate above 650°C for stationary 
power. Polymer fuel cells require expensive 
platinum catalysts; solid-oxide fuel cells 
need expensive seals and connectors and 
have a limited lifetime. Adding functionality 
is difficult at either extreme of temperature. 

Existing fuel cells are slow to respond to 
changes in current and voltage, taking more 
than a second. Hot fuel cells (operating 
above 650 °C) must avoid stressing their con- 
stituent materials, and their fuel processors 
take up to a minute to alter the rate at which 
methane is converted to hydrogen, carbon 
monoxide and carbon dioxide. Cells degrade 
if they are ‘starved’ of fuel. 

Storing charge in or near a fuel cell’s elec- 
trode would speed up the device's respon- 
siveness, allow it to be ‘recharged’ without 
stopping and to live longer than conventional 
fuel cells. The battery-mode concept has been 
demonstrated in the lab in (cold) alkaline fuel 
cells with metal hydride anodes®” and a (hot) 
solid-oxide fuel cell with vanadium oxide 
electrodes®. Although the power densities 
reported were low, the cells stored voltage for 
minutes or hours. More materials research 
is required to increase the power output and 
reduce energy losses at the anode, which 
reacts with the fuel and stores charge. 


INCREASING FUNCTIONALITY 

Another function that could be integrated 
into fuel cells is the electrochemical conver- 
sion of natural gas (methane) into liquid 
fuels such as methanol. Current gas-to- 
liquid (GTL) technologies are economical 
only at large scales; Shell’s Pearl GTL plant 
in Qatar processes up to 45 million cubic 
metres of gas a day. 

Rather than switching off solar panels or 
wind turbines when they are not needed, 
the spare electrons generated when supply 
exceeds demand could be directed towards 
making liquids for transportation fuels 
or chemicals. At natural-gas wells, 
such fuel cells could convert gas that 
would otherwise be flared or vented. 
Such devices have been demonstrated in the 
lab in the past five years but are not yet com- 
mercially viable’. Researchers need to over- 
come the inherent stability of the methane 
molecule and convert it such that it is neither 
oxidized fully to CO2 nor converted back to 
methane from an intermediate state. 

I believe that both charge storage and fuel 
production can be incorporated cheaply into 
fuel cells that operate at intermediate tem- 
peratures. In the 200-500°C range, many 
compounds capture and release hydrogen 
(magnesium hydride is one). These condi- 
tions are also more suitable for GTL con- 
version: hot enough for reasonable reaction 
kinetics, yet not so hot that methane oxidizes 
completely to CO2. Materials research on 
proton and oxygen ion conductors in the 
past decade shows that such fuel cells are 
possible. 

For intermediate-temperature hybrid 
fuel cells to become a reality, researchers 
need to create solid electrolytes with high 
conductivity, and find electrode materials 
that have high activity and stability and 
that react with methane without form- 
ing coke (solid carbon). These devices 
must use less platinum catalyst and more 
impure fuel than low-temperature poly- 
mer cells; make do with cheaper seals and 
connectors; and last longer than higher- 
temperature solid-oxide fuel cells. 

US researchers have made a start. I 
launched the Reliable Electricity Based on 
Electrochemical Systems (REBELS) pro- 
gramme at ARPA-E with $33 million in 
funding in June 2014. It is starting to bear 
fruit’. Efforts elsewhere, particularly in 
Europe and Japan, are addressing hydro- 
gen generation and GTL separately but 
could also benefit from hybrid fuel cells. 

Researchers should prove the viability of 
intermediate-temperature fuel cells with 
these extra functions by demonstrating 
high power density and a lifetime of ten 
years, compared with current cell lifetimes 
of less than five. Cost savings must be vali- 
dated through rigorous techno-economic 
modelling. Advances will then need to be 
scaled up from individual cells to kilowatt- 
scale systems, which will take 5-10 years. 

Regulators, utility companies, technol- 
ogists and users must define an appropri- 
ate mix of technologies and incentives to 
maintain the stability of the electricity 
grid in the coming decades. Hybrid fuel 
cells must be part of that conversation. 


John P. Lemmon is a programme director 
at the US Department of Energy Advanced 
Research Projects Agency-Energy 
(ARPA-E), Washington DC, USA. 

e-mail: john.lemmon@hq.doe.gov 
Democracy is not 
an inconvenience 


Climate scientists are tiring of governance that does not 
lead to action. But democracy must not be weakened in 
the fight against global warming, warns Nico Stehr. 


There are many threats to democracy 
in the modern era. Not least is the 
risk posed by the widespread public 
feeling that politicians are not listening. 
Such discontent can be seen in the politi- 
cal far right: the Tea Party movement in the 
United States, the UK Independence Party, 
the Pegida (Patriotic Europeans Against the 
Islamization of the West) demonstrators in 
Germany, and the National Front in France. 

More surprisingly, a similar impatience 
with the political elite is now also present 
in the scientific community. Researchers 
are increasingly concerned that no one is 
listening to their diagnosis of the dangers 


of human-induced climate change and 
its long-lasting consequences, despite the 
robust scientific consensus. As govern- 
ments continue to fail to take appropriate 
political action, democracy begins to look 
to some like an inconvenient form of gov- 
ernance. There is a tendency to want to take 
decisions out of the hands of politicians 
and the public, and, given the ‘exceptional 
circumstances’, put the decisions into the 
hands of scientists themselves. 

This scientific disenchantment with 
democracy has slipped under the radar 
of many social scientists and commen- 
tators. Attention is urgently needed: 
the solution to the intractable ‘wicked 
problem’ of global warming is to enhance 
democracy, not jettison it. 


VOICES OF DISCONTENT 

Democratic nations seem to have failed 
us in the climate arena so far. The past 
decade's climate summits in Copenhagen, 
Cancun, Durban and Warsaw were politi- 
cal washouts. Expectations for the next 
meeting in Paris this December are low. 

Academics increasingly point to democ- 
racy as a reason for failure. NASA climate 
researcher James Hansen was quoted 
in 2009 in The Guardian as saying: “the 
democratic process doesn’t quite seem 
to be working”’. In a special issue of the 
journal Environmental Politics in 2010, 
political scientist Mark Beeson argued’ 
that forms of ‘good’ authoritarianism “may 
become not only justifiable, but essen- 
tial for the survival of humanity in any- 
thing approaching a civilised form”. The 
title of an opinion piece published earlier 
this year in The Conversation, an online 
magazine funded by universities, sums up 
the issue: “Hidden crisis of liberal democ- 
racy creates climate change paralysis” 
(see go.nature.com/pqgysr). 

The depiction of contemporary democ- 
racies as ill-equipped to deal with climate 
change comes from a range of considera- 
tions. These include a deep-seated pes- 
simism about the psychological make-up 
of humans; the disinclination of people to 
mobilize on issues that seem far removed; 
and the presumed lack of intellectual com- 
petence of people to grasp complex issues. 
On top of these there is the presumed 
scientific illiteracy of most politicians 
and the electorate; the inability of govern- 
ments locked into short-term voting cycles 
to address long-term problems; the influ- 
ence of vested interests on political agen- 
das; the addiction to fossil fuels; and the 
feeling among the climate-science com- 
munity that its message falls on the deaf 
ears of politicians. 

Such views can be heard from the high- 
est ranks of climate science. Hans Joachim 
Schellnhuber, founding director of the 
Potsdam Institute for Climate Impact 
Research and chair of the German Advi- 
sory Council on Global Change, said of 
the inaction in a 2011 interview with Ger- 
man newspaper Der Spiegel: “comfort and 
ignorance are the biggest flaws of human 
character. This is a potentially deadly mix”. 

What, then, is the alternative? The solu- 
tion hinted at by many people leans towards 
a technocracy, in which decisions are made 
by those with technical knowledge. This can 
be seen in a shift in the statements of some 
co-authors of Intergovernmental Panel on 
Climate Change reports, who are moving 
away from a purely advisory role towards 
policy prescription (see, for example, ref. 3). 

We must be careful what we wish for. 
Nations that have followed the path of 
‘authoritarian modernization’, such as 
China and Russia, cannot claim to have 
a record of environmental accomplish- 
ments. In the past two or three years, 
China’s system has made it a global leader 
in renewables (it accounts for more than 
one-quarter of the planet’s investment in 
such energies*). Despite this, it is strug- 
gling to meet ambitious environmen- 
tal targets and will continue to lead the 
world for some time in greenhouse-gas 
emissions. As Chinese citizens become 
wealthier and more educated, they will 
surely push for more democratic inclusion 
in environmental policymaking. 

Broad-based support for environmen- 
tal concerns and subsequent regulations 
came about in 
and adaptability’. 
Democratic nations 
have forged the most effective international 
agreements, such as the Montreal Protocol 
against ozone-depleting substances. 


GLOBAL STAGE 

Impatient scientists often privilege hegem- 
onic players such as world powers, states, 
transnational organizations, and multina- 
tional corporations. They tend to prefer 
sweeping policies of global mitigation over 
messier approaches of local adaptation; for 
them, global knowledge triumphs over local 
know-how. But societal trends are going in 
the opposite direction. The ability of large 
institutions to impose their will on citizens 
is declining. People are mobilizing around 
local concerns and efforts’. 

The pessimistic assessment of the ability 
of democratic governance to cope with and 
control exceptional circumstances is linked 
to an optimistic assessment of the potential 
of large-scale social and economic planning. 
The uncertainties of social, political and eco- 
nomic events are treated as minor obstacles 
that can be overcome easily by implementing 
policies that experts prescribe. But human- 
ity’s capacity to plan ahead effectively is lim- 
ited. The centralized social and economic 
planning concept, widely discussed decades 
ago, has rightly fallen into disrepute’. 

The argument for an authoritarian politi- 
cal approach concentrates on a single effect 
that governance ought to achieve: a reduc- 
tion of greenhouse-gas emissions. By focus- 
ing on that goal, rather than on the economic 
and social conditions that go hand-in-hand 
with it, climate policies are reduced to 
scientific or technical issues. But these are 
not the sole considerations. Environmental 
concerns are tightly entangled with other 
political, economic and cultural issues that 
both broaden the questions at hand and 
open up different ways of approaching them. 
Scientific knowledge is neither immediately 
performative nor persuasive. 


ENHANCE ENGAGEMENT 

There is but one political system that is able 
to rationally and legitimately cope with 
the divergent political interests affected by 
climate change and that is democracy. Only 
a democratic system can sensitively attend 
to the conflicts within and among nations 
and communities, decide between different 
policies, and generally advance the aspira- 
tions of different segments of the popula- 
tion. The ultimate and urgent challenge is 
that of enhancing democracy, for example 
by reducing social inequality*. 

If not, the threat to civilization will be 
much more than just changes to our physi- 
cal environment. The erosion of democracy 
is an unnecessary suppression of social 
complexity and rights. 

The philosopher Friedrich Hayek, who 
led the debate against social and economic 
planning in the mid-twentieth century’, 
noted a paradox that applies today. As 
science advances, it tends to strengthen the 
idea that we should “aim at more deliber- 
ate and comprehensive control of all human 
activities”. Hayek pessimistically added: “It 
is for this reason that those intoxicated by 
the advance of knowledge so often become 
the enemies of freedom”’®. We should heed 
his warning. It is dangerous to blindly 
believe that science and scientists alone can 
tell us what to do. 


Nico Stehr is a sociologist and founding 
director of the European Center for 
Sustainability Research at Zeppelin 
University in Friedrichshafen, Germany. 
e-mail: nico.stehr@t-online.de 
A hypersonic cruise missile engine developed for the US Defense Advanced Research Projects Agency. 


Inventions of war 


Ann Finkbeiner assesses a study of DARPA, the agency 
that readies US technologies for coming conflicts. 


The research arm of the US Department 
of Defense is called the 
Defense Advanced Research Projects 
Agency — a disconcerting combination of 
words. DARPA is small, but it predicts what 
future wars might look like and gets the nec- 
essary technologies ready. It has had stun- 
ning successes, including the foundations 
of the Internet, satellites for reconnaissance 
and global positioning (called Corona and 
Transit, respectively) and stealth technology. 
DARPA also has a particular character: cre- 
ated in reaction to the Soviet Union’s surprise 
launch of the Sputnik satellite, it is meant to 
prevent “technological surprises” through 
high-risk, high-pay-off research. The story of 
this entity, in business for 57 years, should be 
all kinds of interesting. 
But how to tell it? Much of DARPA’s work 
is classified, so any history will necessarily 
be gappy. And whereas the agency’s virtue 


is its responsiveness to changes in warfare, 
that flexibility means that its history goes 
off in all directions. DARPA does not do 
research itself: it funds and works through 
a confusing variety of research centres, as 
well as industrial and academic bodies. The 
missions and titles of its internal offices vary 
with time. Even its name has switched peri- 
odically from ARPA to DARPA (‘Defense’ 
was first added in 1972, to emphasize the 
military nature of the agency’s research). The 
result is a writer’s nightmare, a story with a 
limitless cast of characters and no obvious 
storyline. In The Pentagon’s Brain, journalist 
Annie Jacobsen has tackled the job by telling 
successive stories, focusing mostly on one 
programme at a time. Given DARPA’s inherent 
complexity, this piecemeal approach 
leads to a struggle to capture the whole. 
The bare bones of DARPA’s story are 
as follows. It began in 1958, with two 
physics-based cold war programmes: VELA, 
meant to detect nuclear explosions as verifi- 
cation for a treaty banning nuclear tests; and 
DEFENDER, which worked on defences and 
counter-defences against enemy missiles. By 
the time of the Vietnam War in the 1960s, 
war had become transformed into counter- 
insurgency, combating guerillas and con- 
verting citizens into collaborators. DARPA 
responded with AGILE, a programme that, 
in part, deployed psychology, sociology and 
anthropology to seek information on Viet- 
namese culture. When battlegrounds moved 
from jungles to cities, DARPA branched 
out into sensors and drones, then into 
computer-networked military operations, 
combating bioterrorism, robotics, artificial 
intelligence, human-machine interfaces and 
war simulations (C. Herzfeld Nature 451, 
403-404; 2008). 

In Jacobsen’s piece-by-piece telling, some 
of the stories are straightforward and firmly 
linked to DARPA. Others have gaps, prob- 
ably resulting from information being classi- 
fied, that she bridges with potentially related 
facts and speculation. In one of the straight- 
forward tales, DARPA backed Project 137, a 
1958 advisory meeting of academic scientists. 
One researcher, physicist Nick Christofilos, 
proposed an outlandish test to see whether 
nuclear explosions in the atmosphere could 
create a cloud of electrons that would be held 
by Earth’s magnetic field for long enough to 
stop incoming missiles. (The test was done 
in 1958; the cloud lasted for weeks.) Another 
unambiguous example is SIMNET, a multi- 
player digital war game that DARPA created 
in the early 1980s to simulate realistic air 
and land battles. By mid-1990, SIMNET had 
become a rehearsal for a war in the Middle 
East, with desert terrain, cities, tanks, aircraft 
and armies; according to General Norman 
Schwarzkopf, it “eerily paralleled” the actual 
Gulf War, which started later that year. 

For stories with holes, Jacobsen relies on 
implication, juxtaposing pieces of infor- 
mation that may or may not be related and 
trusting the reader to make the connections. 
Discussing DARPA’s involvement in bioweaponry, 
for instance, she assembles the following 
elements. In 1992, Ken Alibek, a highly 
placed refugee from Soviet bioweapons 
programme Biopreparat, briefed the 
Pentagon. DARPA recognized a need to 
invest in biology. In 1996, the agency 
kick-started a biowarfare programme. In 
1997, it asked the group of academic science 
advisers known as JASON 
(A. Finkbeiner Nature 477, 397-399; 
2011) to report on whether it was feasible 
to engineer pathogens to become more 
lethal (for example, by making aerosolized 
anthrax); an unclassified summary of 
JASON’s report says it was. In 1999, Alibek 
became president of a company that aimed 
to find antidotes to bioweapons, which got 
a one-year contract from DARPA. I am 
unsure what I am meant to make of all this. 


The Pentagon’s Brain: An Uncensored History of DARPA, America’s Top-Secret Military Research Agency 
ANNIE JACOBSEN 
Little, Brown: 2015. 

I am sure, however, that I am intended 
to view as ethically dubious DARPA’s 
decade-old launch of programmes com- 
bining artificial intelligence, autonomous 
robots and brain-computer interfaces. 
Jacobsen cites a JASON report saying that 
any research in this area would be subject 
to ethical regulation. Then, referring to 
work published in Nature on the capac- 
ity of the hormone oxytocin to foster trust 
(M. Kosfeld et al. Nature 435, 673-676; 
2005), Jacobsen wonders whether soldiers 
might be injected with the chemical to 
encourage them to trust robots. And after 
discussing DARPA’s sponsored research 
into limb regeneration and perhaps even 
human cloning, Jacobsen speculates 
on whether DARPA is trying to create 
autonomous hunter-killer robots. Such 
argument-by-juxtaposition is effective in 
fiction. In non-fiction, it is unconvincing. 

Ultimately, Jacobsen’s focus on 
DARPA’s programmes sidesteps the more 
intractable subject of what DARPA is. She 
never addresses such obvious questions as 
how DARPA stays ahead of the next war, 
and whether its flexibility and respon- 
siveness have drawbacks, for example in 
the ratio of risk to pay-off. Furthermore, 
the book promises to “shine a light on 
DARPA’s secret history”, secret because 
so many of the projects are classified. Yet 
the text, checked against sources, shows a 
certain amount of creative interpretation. 
I know from my own reporting on JASON 
that Jacobsen’s chapter on the electronic 
fence in Vietnam inspired by the group 
has little to do with DARPA, and that her 
assessment of JASON as generally central 
to DARPA’s programmes is exaggerated. 

However flawed, The Pentagon's Brain 
is an exciting read that asks an important 
question: what is the risk of allowing lethal 
technologies to be developed in secret? 
Jacobsen worries that the technology that 
DARPA helps to create “may itself outstrip 
DARPA as it is unleashed into the world”. 
The prose might be caffeinated, but the 
message is serious, and has been since the 
first human picked up a rock and thought 
that it might be good for killing. 


Ann Finkbeiner wrote The Jasons and 
is co-proprietor of a science blog, The 
Last Word on Nothing. 

e-mail: anniekf@gmail.com 
An ejection seat and suit used on the Soviet Vostok missions from 1961 to 1963. 


SPACE TRAVEL 


When Soviets ruled the 
great beyond 


Tim Radford is thrilled by an unprecedented exhibition 
marking the USSR’s cold war feats in space. 


Between the cold war years of 1957 and 
1966, the Soviet Union established primacy 
in space. Its heady list of triumphs 
embraces, in the 1950s alone, the first artificial 
object and first animal in orbit, and the first 
image of the far side of the Moon. In the next 
decade, it grew to include the first attempt on 
Venus, the first man in space, the first woman 
in space, the first three-man mission in space, 
and the first spacewalk, automaton touchdown 
on the Moon, lunar rover (1970), and 
scoop of Moon rock brought back to Earth by 
an automaton. Reflecting the significance and 
extent of those triumphs, the long-awaited 
Cosmonauts at the Science Museum in London 
assembles memorabilia and engineering 
marvels borrowed from around a score of 
Russian institutes. 


It opens with […] and rocket visionary 
Konstantin Tsiolkovsky. It concludes with a 
recumbent mannequin in a cradle (a “tissue 
equivalent phantom” flown in 1969 to absorb 
and measure space radiation), representing 
the Soviet dream of a crewed mission to Mars, 
and a quotation attributed to Tsiolkovsky: 
“Earth is the cradle of humanity, but one 
cannot live in a cradle forever.” In between 
is a parade of hardware that none of us who 
followed the news greedily in those years had 
ever dreamed we might see assembled in one 
place, let alone in South Kensington. 


Cosmonauts: Birth of the Space Age 
Science Museum, London. 
Until 16 March 2016. 

The models are marvels. Here is a highly 
polished display model of Sputnik 1, launched 
in October 1957 (its chief designer, Sergei 
Korolev, reportedly said, “This ball will be 
exhibited in museums”). There are two engi- 
neering models: one of the two Lunokhod 
lunar rovers, the other of the once-secret 
lander Lunniy Korabl, designed to deliver 
one cosmonaut to the lunar surface in 1969. 
It flew, but not to the Moon, and the rest of us 
knew of its existence only two decades later. 


A scale model of a 1960s Vostok spacecraft; a 1959 propaganda poster, In the name of peace; and a lower-body negative pressure suit from 1971. 

And then there are the real things. Along 
with the charred, three-person Voskhod 1 
descent module used in 1964 is the descent 
module of Vostok 6. In it, cosmonaut 
Valentina Tereshkova orbited Earth for three 
days in 1963 before a return during which 
the heat shield was scorched by impact with 
Earth’s atmosphere at 27,000 kilometres an 
hour. This is iconic stuff: the RD-108 engine 
that powered the space race; the complex 
space toilet designed to drain human waste 
aboard the space station Mir; the powered 
backpack with port and starboard lights for 
free flight beyond the spacecraft. 


But what sets the scalp prickling are 
the little things that tell those other 
stories implicit in this dizzying show. 
There is Georgy Krutikov’s 1928 
drawing Labour Commune, a 
stratospheric dream prefigur- 
ing the great adventure. And 
there is a little metal mug once 
owned by Korolev, the man 
most people now recognize as 
the driver of the space race, 
and thus the hero of this story. 
Korolev, a Ukrainian, had been 
incarcerated in a prison camp in 
the Kolyma region of Siberia dur- 
ing Joseph Stalin's notorious 1930s 
purges. No Westerner — and few 
Russians — knew his name dur- 
ing the cold war, so closed was the 
Soviet world. Fresh from wartime 
labour detention, he arrived at the German 
Peenemünde base of the Nazi V-2 rocket 
programme to realize the dream of planetary 
exploration. 

Sputnik 1 jolted Western complacency 
and helped to reignite the US space pro- 
gramme originally launched by the aero- 
space engineer and Nazi-turned-émigré 
Wernher von Braun. When Korolev died in 
1966 during what should have been a rou- 
tine operation, the new Soviet leader Leonid 
Brezhnev was a pallbearer. Even then, no one 
in the West knew of Korolev’s existence. 

Inevitably, the rocket engineer’s genius 
surfaces again and again through the exhibition. 
There is a letter signed by Stalin authorizing 
the intercontinental ballistic-missile 
programme that made Sputnik 1 possible, 
and the personalized number plate 
YG1, used by Yuri Gagarin, the foundry 
worker who became a fighter pilot 
and, in 1961, the first man in 
space. There is Korolev’s freehand 
drawing of the launch of canine 
cosmonauts Strelka and Belka. 
Among the Socialist realist posters there is a 
white lab coat daubed in red with 
the Russian for “Space is ours”, a 
memento of a spontaneous 1961 
celebration in Red Square. The 
pencils and sketch pad that Alexei 
Leonov took on his pioneering 1965 
spacewalk (a near-catastrophe) are here, 
along with a later self-portrait of him floating 
at the end of a tether over the Black Sea. 

The United States’ role in the space race is 
hardly acknowledged, beyond a Time maga- 
zine cover declaring Soviet premier Nikita 
Khrushchev its 1957 man of the year. But the 
Soviet space effort seemed to lose momen- 
tum as the US Apollo programme — a story 
told in the Science Museum's main galleries 
— began in every sense to take off. Korolev’s 
death must also have been a factor. The won- 
ders went on, but the never-admitted race for 
the Moon was all but over. 

This cosmic cornucopia reflects the 
intoxication of those first years and looks 
forward to the age of the space station. 
There is a spoon used aboard Mir by Sergei 
Krikalev, the man who went up as a Soviet 
cosmonaut and came down in 1992 as a 
citizen of the Russian Federation (and yes, 
there is a Soyuz descent module that car- 
ried a Mir crew back to Earth that year). 
But this unprecedented collection delivers 
more than a glimpse of distant exploratory 
technologies. It is a snapshot of Soviet his- 
tory and, because the cold war warped the 
twentieth century, of global history, too. 
And where else could you see an ejector seat 
for a dog? The exhibits impose their own 
metaphors: see this show and be uplifted, 
transported, taken out of this world. It is the 
curatorial equivalent of a legal high. 


Tim Radford was science editor of The 
Guardian in London until 2005. 
e-mail: radford.tim@gmail.com 
Lab’s labour’s lost 


Philip Ball appraises Nicole Kidman’s stage turn as crystallographer Rosalind Franklin. 


The 1953 discovery of DNA’s structure 
by James Watson and Francis Crick is 
a triumphant narrative with an uneasy 
subtext. Rosalind Franklin's crystallographic 
work was a vital part of the evidence. Yet, 
although her results (and those of Maurice 
Wilkins) were published in the same issue 
of Nature as theirs, Franklin was denied 
adequate credit for years (see Nature 496, 270; 
2013). Watson and Crick never fully acknowl- 
edged the debt while she lived, and when she 
died at 37 of ovarian cancer, she effectively 
spared the Nobel committee the impossible 
decision of which trio to reward with the 1962 
prize in medicine or physiology. 

The question you need to ask yourself 
before seeing Photograph 51, Anna Ziegler’s 
play about Franklin and the race to pin down 
the double helix, is how you like your science- 
in-theatre. Do you insist on adherence to the 
historical record, or do you accept that the 
aim is to illuminate and interrogate themes? 
There is plenty here to upset the stickler — 
not least, the status of the titular X-ray diffrac- 
tion pattern of DNA, obtained by Franklin 
and PhD student Raymond Gosling at King’s 
College London and used as evidence for 
Watson and Crick’s double-helical model. 
Many of Ziegler’s liberties (such as bringing 
Franklin's illness forward, and implying that 
Wilkins was infatuated with her) serve the 
narrative without compromising the core 
issues. But casting photograph 51 as a eureka 
moment is awkward. 

Franklin did not fully interpret the image, 
for one thing. Nor did she take it (Gosling 
did), although that would not have been 
possible without her expertise. As Matthew 
Cobb writes in his excellent Life’s Greatest 
Secret (Profile, 2015), the image's significance 
has often been overstated, largely because 
Watson chose to play it up (“The instant I saw 
the picture my mouth fell open”) in his 1968 
The Double Helix (Athenaeum). 

In Watson’s book there was also a hint, 
made much of in later accounts, of some- 
thing underhand in how Wilkins — Gosling’s 
supervisor — showed Watson the photo in 
early 1953. That was not true, although 
certainly Wilkins had clashed terribly with 
Franklin. Should Ziegler have used Watson's 
first-hand but unreliable narrative at face 
value to inform the plot? One might argue 
that if Watson could decorate the truth for 
the sake of a good story, why shouldn't she? 

By adopting a dismissive tone towards 
Franklin, Watson's book inadvertently played 
a big part in launching her as a feminist icon. 
And Ziegler’s play (which premiered in Los 
Angeles, California, in 2009) offers a more 
nuanced view of the myth. 


Nicole Kidman as Rosalind Franklin in Photograph 51. 

Ziegler’s players carry the story well. Wat- 
son (Will Attenborough) and Crick (Edward 
Bennett), naturally; the diffident Wilkins 
(Stephen Campbell Moore); Gosling (Joshua 
Silver) doing the PhD student's job of fill- 
ing in gaps and making the tea, figuratively 
and literally. US structural biologist Donald 
Caspar (Patrick Kennedy) almost draws the 
work-obsessed Franklin into her first — and 
only — relationship. Linus Pauling, Max 
Perutz and Lawrence Bragg stay offstage. So 
does a fair bit of the science: we never see the 
double helix, and the audience is left to make 
what it will of phosphates being on the inside 
or the outside of the structure. That is no fault 
in itself — we are spared blackboard primers. 
But the metaphors about base-pairing (as 
Caspar takes Franklin’s hand) or sexualized 
nestling of the twin strands are clunky. 

The play belongs to Franklin. But she 
is written as so buttoned-up, prickly and 
focused that it is easier to warm to the urbane 
Crick or even the impetuous Watson. And 
casting a big star brings its own compli- 
cations. Nicole Kidman’s performance is 
restrained, but the glamour that attends her is 
the opposite of what the part demands. More 
surface ordinariness would have left room for 
a glimpse of depths. 

Misogyny has loomed large in Franklin’s 
tale ever since the feminist reading 
of Anne Sayre’s Rosalind Franklin 
and DNA (Norton, 1975), an 
interpretation that Franklin would have 
disavowed, her sister has said. Had Franklin 
been less excluded and patronized by her 
male peers, might she have had the feedback 
and confidence to solve the structure first? 
In her authoritative The Dark Lady of DNA 
(HarperCollins, 2002), Brenda Maddox chal- 
lenges that idea, suggesting that Franklin's 
class and religion (she came from a wealthy 
Jewish family) had an equal role in her isola- 
tion at King’s. Ziegler finds a good accom- 
modation: without any of the male characters 
becoming chauvinistic caricatures, we are left 
in no doubt that science was not welcoming 
to women in the 1950s. 

More contentious in both history and the 
play is how to think about Franklin’s science. 
Her experimental acumen is made clear; Kid- 
man spends a lot of time at the lab bench. But 
what might have held Franklin back was that 
she did not trust model-building, believing 
that the structure must be revealed through 
mathematical analysis. Along with photo- 
graph 51, Watson and Crick assimilated other 
data, notably biochemist Erwin Chargaff’s 
observation that in DNA, the amounts of 
adenine and thymine bases, and of cytosine 
and guanine, are equal. Perhaps more impor- 
tantly, Watson, Crick and Pauling felt confi- 
dent enough to foul up. All three committed 
howlers in trying to get the prize — Pauling’s 
triple helix, published in early 1953, contained 
elementary errors. Ziegler’s Franklin would 
have been mortified by such blunders. 

That, perhaps, is the most valid message 
of Photograph 51. For science to thrive, there 
must be the freedom to fail. In Franklin’s 
time, it is not surprising that a female scien- 
tist would think that she could ill afford that 
luxury. I am not at all sure that even a young 
Watson and Crick today could so freely take 
the risks they did. And shamefully, with evi- 
dence of gender imbalances in peer review 
and tenure, harassment and discrimination 
in the laboratory, and casual gender stereo- 
typing still deemed acceptable by some lead- 
ing scientists, the stakes remain still higher 
for a latter-day Franklin. 


Photograph 51 

ANNA ZIEGLER 

Until 21 November 2015. 
Noël Coward Theatre, 
London. 
Popular uprising 
spreads science 


Hundreds of young Arab people 
are establishing initiatives to 
promote science in Arabic and 
raise scientific literacy across 
the Middle East, free of the 
censorship and bureaucracy 
of government and religious 
authorities (see Nature Middle 
East http://doi.org/7p8; 2015). 
They are publishing and 
translating scientific news and 
articles every day, including on 
topics such as evolution and 
sex education, which are widely 
taboo in many parts of the Middle 
East. Tens of thousands of reports, 
videos and infographics are 
popularizing a more objective 
way of thinking in the region. 
The movement relies on 
crowdsourcing from a vast pool 
of educated volunteers who are 
supervised by local scientists. 
This ‘uprising’ has reached 
millions of people in a relatively 
short time. One science- 
communication group, Syrian 
Researchers, celebrated its 
millionth follower on Facebook 
in mid-2015 (www.syr-res.com). 
Another group, Scientific Saudi 
(www.scientificsaudi.com), has 
more than 250,000 social-media 
followers and is a learning partner 
of the Arabic edition of the 
MIT Technology Review. 
Muath Alduhishy University of 
Queensland, Herston, Australia. 
Mouhannad Malek Babraham 
Institute, Cambridge, UK. 
dr.alduhishy@gmail.com 


Citizen projects can 
minimize conflicts 


Well-structured schemes for 
citizen scientists can minimize 
the potential for conflicts of 
interest (Nature 524, 265; 2015). 
Projects such as the UK 
Breeding Bird Survey (go.nature. 
com/keyvpu), run by the British 
Trust for Ornithology, use 
volunteer-friendly protocols and 
specify sampling at representative 
locations to standardize 
volunteer commitment. The 


primary motivator for observers 
is then whether to invest and 
participate in the survey, not 
whether they can influence 
which data are recorded. 

Citizen scientists participating 
in well-structured schemes are 
more likely to deliver cost- 
effective monitoring on a large 
scale and to improve societal 
understanding of scientific issues. 
James W. Pearce-Higgins British 
Trust for Ornithology, Thetford, 
UK. 
james.pearce-higgins@bto.org 


Resolve ambiguities 
in China’s emissions 


As the former chair of the 
Consultative Group of Experts 
organized by the United Nations 
Framework Convention on 
Climate Change (UNFCCC) 
to help developing countries 
to produce carbon-emission 
inventories, I question the claim 
that China’s emissions from coal 
have been overestimated (see 
Nature 524, 276; 2015 and Z. Liu 
et al. Nature 524, 335-338; 2015). 
The accuracy of estimates 
depends largely on emission- 
factor estimates for the coal 
China uses (emission factor is 
the amount of carbon oxidized 
per unit of fuel consumed). For 
example, Liu and colleagues 
report emission factors that 
were estimated from the average 
carbon content of a range of high- 
quality to low-quality Chinese 
coal types. They write that these 
emission factors are 40% below 
the default values recommended 
by the 2006 guidelines of the 
Intergovernmental Panel 
on Climate Change (IPCC). 
However, I find their comparison 
flawed because the IPCC factor 
they use derives from coking 
coal, which contains more carbon 
and so has a higher emission 
factor than an ‘average’ coal type. 
Furthermore, the conclusion 
by Liu et al. that China’s fossil- 
fuel use in 2000-12 exceeded 
official figures by 10% seems 
incompatible with the authors’ 
estimated emissions being 12% 


lower than those calculated by 
the Chinese government. The 
official team’s higher estimate 
was based on information from 
China's coal-quality database and 
from coal-trading contracts. 
Such ambiguities call for clear 
resolution so that estimates of 
China's emissions are accurately 
conveyed. 
Fei Teng Tsinghua University, 
Beijing, China. 
tengfei@tsinghua.edu.cn 


Overhaul rules for 
hazardous chemicals 


The huge chemical explosion 
at the Chinese port of Tianjin 
on 12 August is another in the 
country’s long list of industrial 
accidents involving chemicals. In 
2010-14, more than 2,000 people 
were killed in 326 such accidents 
(J. Ren and Y. Mu Chem. Enterp. 
Manag. 16, 28-31; 2015; in 
Chinese), calling into question 
the adequacy and enforcement 
of national regulations for the 
safe management of hazardous 
chemicals. 

China needs better legislation 
and more detailed regulations 
for controlling risk at different 
stages in the life cycle of 
hazardous chemicals. Safety 
supervision must be made 
more effective, for example by 
drawing up regulations modelled 
on the European Union's 
Seveso Directive for industrial 
accidents (see go.nature. 
com/zyjp85). Supervision 
is currently disorderly, 
overseen by a fragmented and 
overlapping structure of multiple 
agencies, including the State 
Administration of Work Safety 
and the ministries of transport, 
public security, environmental 
protection, agriculture and 
health. 

Effective enforcement of 
China's safety management of 
hazardous chemicals may be 
foiled by the limited expertise of 
company front-line managers 
(M. Liu et al. Contemp. Chem. 
Ind. 43, 2661-2662; 2014; 
in Chinese). More data are 


needed on the nature, handling 
and storage of the chemicals 
themselves so that risks can be 
properly assessed and managed 
(see go.nature.com/bzsydq). 
Zhenwu Tang North China 
Electric Power University, Beijing, 
China. 

Qifei Huang, Yufei Yang 
Chinese Research Academy of 
Environmental Sciences, Beijing, 
China. 

huanggf@craes.org.cn 


Olympics will make 
water scarcity worse 


The 2022 Winter Olympics in 
Beijing threaten to seriously 
exacerbate water shortages in the 
area, where the available water 
per person is already only about 
3% of the world’s average (see 
also Nature 524, 278-279; 2015). 
The Winter Olympics will 
take place in February, when 
monthly precipitation in Beijing 
is less than 6 millimetres, so 
the games will need to rely 
exclusively on artificial snow 
(see go.nature.com/bkxbo8). 
This will entail pumping 
massive volumes of water out 
of reservoirs and rivers, further 
reducing the local population's 
water supply, and using huge 
amounts of energy to cool this 
water to make ice crystals that 
can be ejected. For example, 
producing artificial snow for 
winter sports in Beijing in 2010 
used the annual equivalent of the 
water and electricity consumed 
by 8,300 and 5,400 households, 
respectively (see go.nature.com/ 
ysdpbd; in Chinese). 
Hong Yang University of Oslo, 
Norway. 
Julian R. Thompson, Roger 
J. Flower University College 
London, UK. 
hongyanghy@gmail.com 
STAP cells are derived from ES cells 


ARISING FROM H. Obokata et al. Nature 505, 641-647 (2014) doi:10.1038/nature12968; retraction 511, 112 (2014) doi:10.1038/nature13598; and 
H. Obokata et al. Nature 505, 676-680 (2014) doi:10.1038/nature12969; retraction 511, 112 (2014) doi:10.1038/nature13599 


Two reports claiming a novel cellular reprogramming phenomenon, 
stimulus-triggered acquisition of pluripotency (STAP), were published 
in Nature last year’”, but then subsequently retracted**. The identity of 
STAP cells and STAP-derived stem cells, however, has remained 
undetermined. Here we report the results of a whole-genome sequen- 
cing (WGS) investigation of STAP-related samples kept mainly at the 
RIKEN Center for Developmental Biology. We show that all purported 
STAP stem-cell lines were contaminated with embryonic stem (ES) 
cells, and that chimaeric mice and teratomas supposedly derived from 
STAP cells instead show ES cell contribution. 

The original article’ reported that exposure to low pH can repro- 
gram differentiated cells into unique pluripotent cells (STAP cells), 
from which two secondary cell lines were established: ES-like STAP 
stem cells and trophoblast stem-like Fgf4-induced stem cells capable 
of generating placental cells’*. Because STAP cells were not main- 
tained as frozen stocks, we first performed WGS of 15 genomic DNA 
samples in total, including three representative STAP stem-cell lines 
with different genetic backgrounds, an Fgf4-induced stem-cell line, 
and seven ES cell lines established at the Wakayama laboratory before 
or during the STAP study (Extended Data Table 1). We determined 
genome-wide patterns of single-nucleotide polymorphisms (SNPs) 
that distinguish mouse strains 129/Sv (129) and C57BL/6 (B6), as well 
as green fluorescent protein (GFP) transgene types (Supplementary 
Methods and Extended Data Fig. 1a). No samples from the Oct4-GFP 
Fgf4-induced stem cells described in the original letter? were found 
(Oct4 is also known as Pou5f1). 

The STAP stem-cell line FLS and the Fgf4-induced stem-cell line 
CTS were reported to carry a homozygous insertion of a single cag-gfp 
transgene with the genetic background of 129 female × B6 male 
(Extended Data Table 1). However, these cell lines had co-insertions 
of two GFP transgenes’, sperm-specific acrosin-promoter-gfp° and 
ubiquitously expressed cag-gfp’ (hereafter designated Acr/cag-gfp) at 
chromosome 3, which originated from an Acr/cag-GFP B6 mouse 
strain® not described in the STAP papers’”. These STAP cell lines 
were then compared with four ES cell lines—FES1, FES2, and two 
nuclear transfer ES lines (ntESG1 and ntESG2) (ref. 9)—established 
from crossing the Acr/cag-GFP mouse strain with 129 mice in the 
Wakayama laboratory in 2005 (Extended Data Fig. 1a and Extended 
Data Table 1). FES1 and FES2 cells shared homologous SNP patterns 
with these STAP cell lines over the entire genome, including the 129 X 
chromosome, while ntESG1 and ntESG2 cells bearing B6 X chro- 
mosome were excluded from the comparison. Furthermore, these 
STAP cell lines shared two genomic characteristics with FES1, but 
not FES2; first, two chromosomal deletions (Fig. 1a) are present only 
in FES1 and all Acr/cag-GFP STAP stem-cell sublines, but not in the 
other cell lines examined, in the paternal Acr/cag-GFP mice (frozen 
stock in 2010), or in potential maternal 129 substrains available in 
Japan. Second, FES1 and the STAP cell lines with Acr/cag-GFP share 


Figure 1 | STAP cells and STAP stem cells are derived from ES cells. a, Two 
genomic deletions exclusively shared by STAP and Fgf4-induced stem-cell lines 
and FES1 ES cells carrying Acr/cag-gfp. b, SNP mosaicisms of chromosome 
12 (Chr12) in Acr/cag-GFP+ cells. The panel shows SNPs at 1-megabase 
resolution. A large SNP mosaic region (red rectangle) differs between 
FES1 and FES2 ES cells. All Acr/cag-GFP+ STAP-cell lines have the same 
mosaicism as FES1 ES cells (see Extended Data Fig. 1 for Chr6 and Chr11). 
c, Acr-gfp and two deletions (chromosomes 3 and 8) specific to FES1 ES cells 
are inherited by offspring of STAP cell chimaeric mice. d, e, Teratomas 
are derived from ES cells. qPCR reproducibly detects Acr-gfp (d) and FES1 
ES-cell-specific deletions (e) in genomic DNA prepared from the STAP cell 
teratoma paraffin block. Lanes 1: STAP cell teratoma; 2: STAP cell teratoma 
(separately prepared); 3: FLS4 (Acr/cag-GFP+ STAP stem cell); 4: 129B6F1 ES-5 
(control ES cell); 5: GLS13 (Oct4-GFP+ STAP stem cell); 6: C57BL/6NCrSlc 
mouse; and 7: no template DNA. Each value shows fold-amplification relative 
to the Il2 gene (see Supplementary Methods). f, DAPI staining of a section taken 
from the STAP cell teratoma paraffin block. The intestinal epithelium and 
pancreatic tissue in the rectangles correspond to Fig. 2e and Extended Data 
Fig. 4c from ref. 1, respectively. g, h, Magnifications of the rectangles with 
immunostaining for enhanced GFP (eGFP), indicating that these tissues are 
derived from GFP-negative host tissues (white arrowheads). Scale bar, 1 mm. 
large SNP clusters that differ between FES1 and FES2 in three 
chromosomes (Fig. 1b and Extended Data Fig. 1b, c). These differential 
SNP clusters probably arose from chromosomal heterogeneity in the 
parental mouse colonies when FES1 and FES2 were established. It is 
highly unlikely that the Acr/cag-GFP STAP cell lines and FES1 all 
independently acquired these two unique deletions and inherited 
the same three mosaic chromosomes from parental mice. An ES cell 
stock, 129/GFP ES, was also found to share all these genomic features 
(Extended Data Table 1). 

After the above three SNP clusters reflecting parental heterogeneity 
are excluded, the remaining 1,290 SNP alleles that distinguish FES1 
and FES2 are supposed to have accumulated at or after establishment 
in 2005. Regarding these SNPs, STAP cell lines FLS3 and CTS1 and 
129/GFP ES cells are nearly identical, but differ slightly from FES1 (at 
30% of these alleles), suggesting that STAP cell lines FLS and CTS 
were derived from a sub-stock of FES1 ES cells. 
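
The allele comparison in the preceding paragraph can be illustrated with a minimal Python sketch. The genotype dictionaries, the site labels and the `discordance` helper are hypothetical and are not the authors' pipeline; the sketch only shows how a fraction of differing alleles, such as the roughly 30% reported between FES1 and the FLS3/CTS1 group, might be computed from per-site genotype calls.

```python
def discordance(calls_a, calls_b):
    """Fraction of shared sites at which two cell lines carry different genotypes."""
    shared = set(calls_a) & set(calls_b)
    if not shared:
        raise ValueError("the two call sets have no sites in common")
    differing = sum(1 for site in shared if calls_a[site] != calls_b[site])
    return differing / len(shared)


# Toy data (hypothetical): ten distinguishing SNP sites, genotype "het" or "129".
fes1 = {f"snp{i}": "het" for i in range(10)}

# A derived sub-stock that differs from FES1 at three of the ten sites.
fls3 = dict(fes1)
for site in ("snp0", "snp1", "snp2"):
    fls3[site] = "129"

# A second line that is identical to the sub-stock.
es_129_gfp = dict(fls3)

print(discordance(fes1, fls3))        # 0.3
print(discordance(fls3, es_129_gfp))  # 0.0
```

In this toy case the two derived lines are identical to each other but differ from the founder at 30% of sites, which is the pattern that would point to a shared sub-stock.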

The STAP stem-cell line GLS1-13 was reported as established from 
STAP cells prepared from genomic Oct4 fragments (GOF) mice (B6 
background) carrying the Oct4-gfp transgene’® in 2012. All these cell 
lines have a large truncation with a terminal inverted repeat in one of 
two X chromosomes (Extended Data Fig. 2a). An identical X chro- 
mosome was found in GOF-ES, an ntES cell line established from 
GOF mice in 2011, but not in parental GOF mice. It is unlikely that 
such a peculiar X chromosome abnormality would occur indepen- 
dently, strongly suggesting that the GLS lines were derived from the 
GOF-ES. 

SNP analysis revealed that two independent STAP stem-cell lines, 
AC129-1 and AC129-2, had a 129B6F1 genetic background, while 
they were documented in the original article (ref. 1) as being established 
from 129 cag-GFP mice. We identified five heterozygous genomic 
anomalies: four deletions, and a duplication in these STAP stem-cell 
lines (Extended Data Fig. 2b, d), which were not found in the 
sequenced parental mouse genomes. We identified that these anom- 
alies and sexual identity were shared by one of six control ES cell lines 
with cag-gfp, 129B6F1 ES1, established earlier than AC129. This is 
also the case for the other cag-GFP STAP stem-cell lines, FLS-T1 and 
T2, established in 2013. The 129B6F1 ES1 also shares a characteristic 
homozygous B6-SNP cluster in chromosome 6 with these four cag- 
GFP STAP stem-cell lines (Extended Data Fig. 2c, d). It is unlikely that 
the 129B6 ES1 line and these cag-GFP STAP stem-cell lines indepen- 
dently inherited all five chromosomal anomalies, the Y chromosome, 
and the same chromosome 6 from parental mice at establishment. 

The article describes 2N chimaeric mice generated from STAP 
cells bearing cag-gfp on the 129B6F1 background and their germ-line 
transmission (Fig. 4 and Extended Data Fig. 7 in ref. 1) as evidence for 
pluripotency. We found nine genomic DNA samples for the offspring 
of STAP cell chimaeric mice (Extended Data Fig. 7c in ref. 1). These 
contained not only the Acr-gfp insertion but also the two deletions 
unique to FES1-derived Acr/cag-GFP cell lines described above 
(Fig. 1c) indicating that the cells transmitted to the germ line in the 
chimaeric mice were derived from FES1 ES cells. 

The article also describes teratomas derived from Oct4-GFP STAP 
cells as evidence for pluripotency (Fig. 2 and Extended Data Fig. 4 in 
ref. 1). We found a glass slide specimen from which all these teratoma 
images were taken, and its corresponding paraffin block. Quantitative 
PCR of genomic DNA extracted from this paraffin block reproducibly 
indicated that these teratoma tissues formed from FES1-derived cells 
(Fig. 1d, e). Immunostaining revealed that intestinal epithelium tissue 
(Fig. 2e, right in ref. 1) and pancreatic tissue (Extended Data Fig. 4c in 
ref. 1), shown as teratomas from STAP cells’, were GFP-negative and, 
thus, of host mouse origin (Fig. 1f-h). 
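
The qPCR values in Fig. 1d, e are reported as fold-amplification relative to a reference gene. The exact quantification scheme is given in the Supplementary Methods, which are not reproduced here, so the following Python sketch assumes the common 2^(-ΔCt) convention with near-100% amplification efficiency. The function name and the Ct values are illustrative only.

```python
def fold_relative_to_reference(ct_target, ct_reference):
    """Relative amount of a target amplicon versus a reference gene.

    Assumes each PCR cycle doubles the product (2^-dCt convention),
    where dCt = ct_target - ct_reference.
    """
    return 2.0 ** (ct_reference - ct_target)


# Hypothetical Ct values: the target reaches threshold at the same cycle as the reference.
print(fold_relative_to_reference(24.0, 24.0))  # 1.0

# Hypothetical Ct values: the target lags the reference by two cycles.
print(fold_relative_to_reference(26.0, 24.0))  # 0.25
```

A sample lacking the target sequence would be expected to give a very late or absent Ct, and hence a fold value close to zero.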


Control genomic DNA sequences for STAP cell chromatin immu- 
noprecipitation sequencing (ChIP-seq) experiments (Fig. 4 in ref. 2) 
had been deposited in the NCBI database’. To gain sufficient sequen- 
cing coverage, we re-sequenced the genomic DNA prepared from the 
STAP cell lysate used for ChIP-seq (Extended Data Fig. 1a). We con- 
firmed that this STAP cell sample shared all the genomic character- 
istics described above for 129B6F1 ES1 (Extended Data Fig. 2c), 
indicating that the STAP cell sample used for ChIP-seq was derived 
from 129B6F1 ES1 cells. 

In summary, our investigations based on WGS of STAP-cell- 
related materials reveal that all of these materials are derived from 
previously established ES cell lines and refute the evidence shown in 
the two Nature papers’ that cellular stress can reprogram differen- 
tiated cells into pluripotent cells. Data described here were presented 
to an external investigative committee convened by RIKEN. Raw data 
are available at the DDBJ sequence read archive (DRA) under acces- 
sion number DRA002862. 
Extended Data Figure 1 | Genome-wide SNP patterns of STAP-cell-related 
cells and mice. a, SNP patterns of STAP stem cells, Fgf4-induced stem cells and 
related ES cells as revealed by WGS. Chromosomes 1-19 and X are aligned 
from left to right. All cell lines and mouse strains except for STAP stem 
cell GLS1, GOF-ES and GOF-mouse are male. 129/GFP ES cells and the 
re-sequenced control DNA of STAP cells for ChIP-seq (Fig. 4 in ref. 2) are also 
shown. B6-homozygous, B6/129-heterozygous and 129-homozygous SNPs 
are shown in magenta, green and blue, respectively. Note that ntESG1 and 
ntESG2 inherited the B6-type X chromosome from maternal mice. Genomic 
regions in which FES1 and FES2 ES cells have different SNP clusters in 
three chromosomes (chromosomes 6, 11 and 12) are marked by red rectangles. 
See b, c and Fig. 1b for a high-resolution map. SNP resolution is 10 Mb. 

b, c, High-resolution view of chromosomes 6 (b) and 11 (c), which show 
differential SNP clusters (red rectangles) between FES1 and FES2. In 
these regions, all SNPs are 129-type in FES1, and B6-type in FES2 on one of the 
homologous chromosomes. Therefore, in these regions, 129/B6 SNPs (green) 
and 129/129 SNPs (blue) in FES1 correspond to B6/B6 SNPs (red) and 
B6/129 SNPs (green) in FES2. STAP stem-cell line FLS3, Fgf4-induced stem-cell line 
CTS1, and 129/GFP ES cells share the same SNP patterns as FES1. SNP 
resolution is 1 Mb. 
Extended Data Figure 2 | STAP stem cells derived from GOF-ES cells and 
cag-GFP ES cells. a, Copy number of chromosome X in STAP stem cells GLS1 
and GOF-ES cells, both of which have Oct4-gfp transgenes with a B6 
background. These lines have one very short X chromosome of ~23 Mb with a 
terminal inverted repeat and a normal X chromosome. b, PCR detection of 
chromosomal anomalies and Y chromosome in cag-GFP STAP stem-cell lines 
and parental mouse strains. Lanes 1-6: control ES cells, 129B6F1 ES1-6; 7: 
STAP stem cell AC129-1; 8: STAP stem cell AC129-2; 9: STAP stem cell 
FLS-T1; 10: STAP stem cell FLS-T2; and 11: GOF-ES. Deletions 1-4 and 
duplication 1 are located on Chr19:32,857,093-32,866,121, Chr1:140,698,249-140,702,693, 
Chr4:123,747,239-123,763,596, Chr10:43,265,147-43,267,270 
and Chr1:180,730,393-180,732,937, respectively. c, Distribution of B6-type 
and 129-type SNPs along chromosome 6. The B6-homozygous SNP cluster 
(magenta) in the middle, which probably arose from the inheritance of 
the parental 129, is heterogeneous in length among six control ES cell lines. The 
four cag-GFP STAP stem-cell lines share the same length of the B6 SNP cluster 
with control ES 129B6F1 ES1. Note that the 129/B6-heterozygous SNP 
region in the 129 cag-GFP mouse is longer than that of AC129-1. d, Table 
summarizing the chromosomal anomalies and differential types of Chr6 
B6-homozygous SNP clusters in the cag-GFP cell lines and parental mice. Control 
ES cell 129B6F1 ES1 shares all the characteristic features with the four cag-GFP 
STAP stem-cell lines. 
Extended Data Table 1 | STAP-cell-related cell lines 


FLS1-8 STAP stem cell Acr/cag-gfp hetero 129X1SLCQ/ 2012 1/31-2/2 FES1 ES Article Fig. Sc, j-l, 


(FLS3) (cag-gfp homo) || B6N SLC 6 eFig. 8d,i,j 


Letter Fig. 2f, g, i, 
2012 5/25, 7/9 FES1 ES Fig. 4, eFig. 2a, b, 
eFi.g. 3 


GLS-1-13 Article Fig. 5 a-c, 
GLS-1) STAP stem cell Oct4-gfp 2012 1/31 GOF-ES eFig. 8k 
Letter Fig. 4 
129X1SLC9/ ; 
AC129-1, 2 Article Methods 
> ss 2 
(AC129-1) STAP stem cell cag-gfp homo B6N SLC 3 2012 8/13 129B6F1 ES1 Letter Fig. 2i, 4 
(129X1SLC) § 
FLS-Tl, T2# STAP stem cell cag-gfp homo 129X1SLC9/ 2013 2/22 129B6F1 ES1 
BON SLCG 
129B6F1 ES1-6 a. 129X1SLC9/ 5 
(129B6F1 ES6) * fertilized egg ES cell GEN SLC 2012 4/19 
129/GEP ES ** ES cell Acr/eag-gfp hetero 129X1SLC@/ Unknown 
BON SLC 
GOF-ES ntES cell 2011 5/26-10/31 = 
FES] + fertilized egg ES cell | Acr/cag-gfp hetero oe 2005 12/7 S- 


CTS-1, 11-13 
(CTS1) 


Acr/cag-gfp hetero 129X1SLC9/ 


Fgf4-induced stem cell (cag-gfp homo) || B6N SLC 3 


Article Fig. 8d,1,j 
Letter Fig. 2, Fig. 4, 
eFig. 3,4,6 


2 - 129X1SLCQ/ 

FES2 fertilized ES cell | Acr/cag-gfp hetero 2005 12/7 
_ — “m pe 
ntESG1 + ntES cell Acr/cag-gfp hetero DON, SLC9/ 2007 8/3 

129°Ter CLEAG 
ntESG2 7 ntES cell Acr/cag-gfp hetero ON SLC9/ 2005 1/20 
129°Ter CLEAS 


*The line subjected to WGS is indicated in parentheses in cases in which several sublines were established for one cell type. Other sublines were confirmed by PCR and sequencing. 

†The type of gfp transgene described in the original Nature papers (refs 1, 2) is shown in parentheses when it is different from the one determined by WGS. 

‡The parental mouse strains and their genetic background heterogeneity were identified using the TaqMan PCR system that discriminated mouse substrains and breeders. Parental mouse genotypes are not 
homogeneous as judged by their SNPs. 129X1SLC denotes 129X1/SvJJmsSlc; B6N SLC denotes C57BL/6NCrSlc; B6 denotes C57BL/6; 129+Ter CLEA denotes 129+Ter/SvJcl. 

§Cultivation start date. For GOF-ES this is the date of cell line establishment, and for FES1, FES2, ntESG1 and ntESG2 this is the date when frozen stocks were created. Information was obtained from T. Wakayama 
and his laboratory members. 

||Described as ‘cag-gfp (homozygous)’ by the author who established the cell lines. 

¶Described as ‘129 cag-gfp (homozygous)’ by the author who prepared mice and established those cell lines. 

#Obtained from T. Wakayama. WGS of FLS-T1 and FLS-T2 was not performed. PCR and Sanger sequence analyses showed that these two lines shared all characteristic genomic anomalies with AC129. 

☆The genome of 129B6F1 ES6 was sequenced, but further analysis showed that 129B6F1 ES1 rather than 129B6F1 ES6 shares all genomic anomalies found in AC129-1 (see main text). 

**129/GFP ES corresponds to Acr/cag-GFP cells found in the laboratory of the authors, and has identical genomic structures, including deletions and fine SNP patterns, to those of FLS1 and CTS1. 

††Obtained from H. Ohta. Full names: FES1: 129B6GFP1 FES male; FES2: 129B6GFP2 FES male; ntESG1: 129B6F1G1; ntESG2: 129B6F1G2. 
Failure to replicate the STAP cell phenomenon 


ARISING FROM H. Obokata et al. Nature 505, 641-647 (2014) doi:10.1038/nature12968; retraction 511, 112 (2014) doi:10.1038/nature13598; 
and H. Obokata et al. Nature 505, 676-680 (2014) doi:10.1038/nature12969; retraction 511, 112 (2014) doi:10.1038/nature13599 


Although the reports that stress (such as exposure to acid) can 
coax somatic cells into a novel state of pluripotency’* have been 
retracted*, the validity of stimulus-triggered acquisition of pluripo- 
tency (STAP) remains unclear (http://dx.doi.org/10.1038/protex. 
2014.008 and Supplementary Information). Here we describe the 
efforts of seven laboratories to replicate STAP, including experi- 
ments performed within the laboratory where STAP first originated, 
as well as re-analysis of the sequencing data from the STAP 
reports. Neonatal cells treated with two STAP protocols exhibited 
artefactual autofluorescence rather than bona fide reactivation of an 
Oct4 (also known as Pou5f1) and green fluorescent protein (GFP) 
transgene reporter, did not reactivate pluripotency markers towards 
embryonic stem (ES)-cell-like levels, and failed to generate terato- 
mas or chimaerize blastocysts. Re-analysis of the original RNA 
sequencing (RNA-seq) and chromatin immunoprecipitation 
sequencing (ChIP-seq) data identified discrepancies in the sex and 
genetic composition of parental donor cells and converted stem 
cells, and revealed a STAP-derived cell line to be a mixture contain- 
ing trophoblast stem cells, attesting to the importance of validating 
the properties and provenance of pluripotent stem cells using a wide 
range of criteria. 

To assess the reprogramming capacity of STAP protocols, we used 
a transgenic Oct4-GFP reporter, which shows GFP reactivation during 
Oct4/Sox2/Klf4 reprogramming, in established induced pluripotent 
stem (iPS) cells and in the gonads of mid-gestation ‘all iPS cell’ 
embryos generated by tetraploid complementation*’ (Extended Data 
Figs 1 and 2a). Working within the Vacanti laboratory where the 
concept of STAP cells originated, and assisted by a co-author of the 
STAP papers, a Daley laboratory member (A.D.L.A.) attempted to 
replicate two reported STAP protocols: (1) mechanical trituration and 
acid treatment of mouse lung cells (Brigham and Women’s Hospital 
(BWH) protocol; see Supplementary Information), and (2) acid treat- 
ment of mouse splenocytes (RIKEN protocol; Methods and Extended 
Data Fig. 2b). Seventy-two hours after stress treatment of lung cells, 
floating spheres appeared amidst cellular debris. Fluorescence micro- 
scopy revealed that both Oct4-GFP and wild-type spheres emitted low- 
level broad spectrum fluorescence detectable within both green and red 
filters, indicating autofluorescence (Fig. 1a). Untreated Oct4-GFP ES 
cells did not emit the same low-level broad spectrum fluorescence as 
STAP-treated cells. STAP-treated splenocytes formed spheres with 
lower efficiency, but also appeared autofluorescent. 

Flow cytometry indicated STAP-treated Oct4-GFP cells did not 
exhibit Oct4-GFP reactivation at levels comparable to control Oct4- 
GFP mouse ES cells, and were indistinguishable from stressed wild- 
type controls (Fig. 1b). Absence of ES-cell-like levels of Oct4, Sox2 
and Nanog transcripts and nonspecific immunofluorescence corro- 
borated flow cytometry data (Extended Data Fig. 2c, d). Rare plur- 
ipotent cells should generate teratomas in immunocompromised 
mice®’, but STAP cells could not, unlike control ES cells (Extended 
Data Fig. 2e, f). Replication of the poly-L-glycolic acid (PLGA)-based 
teratoma production method described in the original STAP reports 
with GFP cells to distinguish host and donor contribution produced 
distinct masses of connective tissue, muscle and scar, with minimal 
GFP content, indicating primarily host origin (Fig. 1c, d and 
Extended Data Fig. 2g). Rare GFP-positive clusters did not form 
differentiated tissues characteristic of ES-cell-derived teratomas 
(Fig. 1d). Autofluorescent spheres failed to enter development after 
morula aggregation or blastocyst injection (Fig. 1e and Extended 
Data Fig. 2h-j). Therefore, pluripotency was undetectable in STAP 
experiments. Six other laboratories (Deng, Hanna, Hochedlinger, 
Jaenisch, Pei and Wernig) also attempted to generate STAP cells 
(Table 1) and made the following observations. First, autofluorescent 
sphere-like aggregates after STAP treatment were universally seen. 
Second, transgenic reporters used by Obokata and colleagues 
(GOF18-Oct4-GFP, containing the 18-kilobase genomic Oct4 frag- 
ment (GOF18)) and by the Daley, Pei and Hanna laboratories 
(GOF18-Oct4ΔPE-GFP, lacking the Oct4 proximal enhancer (PE) 
element) both exhibit activity in pre-implantation embryos, early 
post-implantation epiblast cells (embryonic day (E) 5.5), germ 
cells, and mouse ES/iPS cells; however, differential activity in late 
post-implantation epiblast (E6.5) and early passage mouse epi- 
blast-derived stem cells has been ascribed to the Oct4 proximal 
enhancer’®*. Using the same reporter as Obokata and colleagues’”, 
the Deng laboratory observed that the GFP signal in chemical iPS cells 
was easily distinguishable from the autofluorescence of STAP-treated 
cells (Extended Data Fig. 2k). The Jaenisch, Wernig and Hochedlinger 
laboratories failed to observe GFP reactivation with Oct4 or Nanog 
knock-in reporters, excluding a scenario of uncoupling between GFP 
and endogenous pluripotency expression’. Despite a range of tested 
reporters, no group documented authentic Oct4/Nanog reporter 
activation that resembled bona fide ES cells. Third, the Deng laboratory 
failed to observe Oct4, Sox2 and Nanog induction 3 and 7 days after 
STAP treatment, reducing the likelihood that pluripotency was transiently 
activated and silenced by day 7 (Extended Data Fig. 2l). Finally, 
the Hanna, Wernig and Hochedlinger laboratories failed to generate 
stem-cell lines by culturing STAP-treated cells in leukaemia inhibitory 
factor (LIF) and adrenocorticotropic hormone (ACTH)-supplemented 
medium. In summary, 133 replicate attempts failed to document gen- 
eration of ES-cell-like cells, corroborating and extending a recent 
report’. 

We re-examined the high-throughput sequencing data from the 
STAP reports to investigate the genetic provenance of parental 
CD45+ cells and converted STAP cells, STAP stem cells and 
Fgf4-induced stem cells (FI-SCs) (Fig. 1f). Comparative genomic 
hybridization array data mentioned in the original paper’ were 
not publicly released. Copy number variation (CNV) analysis con- 
ducted using ChIP-seq input samples revealed a discrepancy in sex 
across samples as well as chromosomal aberrations (Fig. 1g). In the 
original STAP reports, the authors stated that they mixed CD45+ 
cells from male and female mice owing to the small number of 
CD45+ cells retrieved from individual neonatal spleens. However, 
our analysis indicates that CD45+ cells were female, whereas the 
derived cells (STAP cells, STAP stem cells and FI-SCs) were all 
male, a clear inconsistency. We note that control ES cells were also 
male (Fig. 1g). FI-SCs possessed trisomy 8, which renders mouse ES 
cells germline-incompetent™ (Fig. 1g). 
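
The copy number calls described above rest on a log2 ratio of observed to expected read counts, with the cutoffs quoted in the Fig. 1g legend (0.2 and -0.2 on the log2 scale, with P of 0.01). The Python sketch below illustrates that classification under stated assumptions: the function names, the read counts and the treatment of the cutoffs as inclusive are all hypothetical, and the statistical test that produces the P value is not modelled.

```python
import math


def log2_ratio(observed_reads, expected_reads):
    """Log2 ratio of observed to expected read counts for a genomic region."""
    return math.log2(observed_reads / expected_reads)


def classify(log2_cn, p_value, threshold=0.2, alpha=0.01):
    """Label a region as amplification, deletion or non-significant."""
    if p_value > alpha:
        return "non-significant"
    if log2_cn >= threshold:
        return "amplification"
    if log2_cn <= -threshold:
        return "deletion"
    return "non-significant"


# Hypothetical counts: three copies where two are expected (for example, a trisomy).
print(classify(log2_ratio(150, 100), p_value=0.001))  # amplification

# Hypothetical counts: one copy where two are expected (for example, a single X).
print(classify(log2_ratio(50, 100), p_value=0.001))   # deletion

# Hypothetical counts: observed matches expected.
print(classify(log2_ratio(100, 100), p_value=0.001))  # non-significant
```

Under this scheme a single X chromosome measured against a two-copy expectation gives a log2 ratio of -1, which is how a male sample would stand out from a female one.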

Inferred single nucleotide variants (SNVs) from RNA-seq data 
allowed classification of samples as genetically similar or dissimilar 
(Fig. 1h). Control ES cells, parental donor female CD45+ cells, STAP 
cells, and STAP stem cells all possessed similar SNV profiles, consist- 
ent with their derivation from a first generation hybrid of C57BL6/129 
strains, the reported genotype (Fig. 1h and Extended Data Fig. 3). By 
contrast, FI-SCs had an SNV profile that matched a single nucleotide 
polymorphism (SNP) profile of C57BL6 strain origin, indicating 
Figure 1 | Characterization of the STAP cell phenomenon. See 
Supplementary Information for further details. a, STAP treatment produces 
fluorescent signal detected in both FITC (green) and TRITC (red) channels 
in STAP-treated Oct4-GFP and wild-type (WT) cells, consistent with 
autofluorescence. TRITC signal was not detected in control Oct4-GFP mouse 
ES cells. Note saturation of the green signal in Oct4-GFP ES cells at the higher 
exposure time required to detect FITC from autofluorescent spheres. 

b, Absence of ES-cell-like Oct4-GFP reactivation. Representative flow 
cytometry results 7 days after STAP treatment of lung cells or splenocytes 
(BWH or RIKEN protocol, respectively) without singlet/doublet exclusion and 
live/dead-cell discrimination. GFP gates were calibrated based on control Oct4- 
GFP ES cells grown on feeders. Whereas control ES cells are bright and situated 
at approximately 1 X 10° (arbitrary units), no event resembling Oct4-GFP ES 
cells was detected after STAP treatment. One replicate per protocol is shown. 
iMEFs, irradiated MEFs. c, STAP-treated cells do not form teratomas using 
PLGA-based teratoma production methods’. Photograph of control mouse ES- 
cell-derived teratoma (top left) and non-teratoma STAP-PLGA mass (bottom 
left). Representative haematoxylin and eosin (H&E) stainings of a control 
mouse ES-cell-derived teratoma (top right) and the non-teratoma STAP-PLGA 
mass (bottom right). d, STAP-PLGA mixtures present no indication of ES-cell- 
like in vivo differentiation capacity after injection into immunocompromised 
mice. Note lack of organization into representative tissue structures typically 
observed in ES-cell-derived teratomas. DAPI, 4’,6-diamidino-2-phenylindole. 
e, STAP-treated lung cells fail to incorporate into preimplantation embryos 
after morula aggregation. f, Analysis of sequencing data. Samples are classified 
based on copy number and genotype. STAP cells, STAP stem cells (STAP-SCs) 


distinct genetic provenance from parental CD45+ and STAP samples 
(Fig. 1h and Extended Data Fig. 3). Independently sourced control 
epiblast stem cells and trophoblast stem cells (TSCs) had SNV profiles 
divergent from the CD45+ and STAP sample cohort, as expected 
(Fig. 1h). An anomalous allele frequency distribution observed in 


and ES cells share similar characteristics for genotype and copy number of 
chromosome X. g, Copy number (CN) profiles, reported as a log2 ratio 
(observed to expected read counts), derived using ChIP-seq input data. Red/ 
green correspond to significant amplifications and deletions (log2(CN) ≥ 0.2 
or ≤ −0.2 and P ≤ 0.01), respectively. Grey denotes non-significant variants. 
Note the amplifications of chromosomes 8 (FI-SCs) and 6/11 (TSCs) and the 
single copy of chromosome X in STAP cells, STAP-SCs, FI-SCs and ES cells. 
h, SNVs inferred from RNA-seq data using the mouse reference genome 
(derived from C57BL6 strain). The selected SNVs are classified as homozygous 
for reference allele (0/0 genotype), homozygous for alternative allele (1/1 
genotype) or heterozygous (0/1 genotype). Samples are clustered based on the 
sum of edit distance between each SNV. Note that each pair of replicates is 
always grouped together. A subset of samples (CD45+, STAP, STAP-SCs and 
ES cells) shows prevalence of heterozygous alleles (A); FI-SC samples have 
prevalence of homozygous alleles for the reference variant (B); and, TSC and 
epiblast stem cell (EpiSC) samples have a larger number of homozygous 
alternative alleles (C). i, Contamination in the FI-SC samples with TSCs. The 
expected frequency of reads covering the alternative allele for heterozygous 
SNVs is ~50%, which is observed in all samples including TSCs (left). In FI- 
SCs, it was ~12% (Extended Data Fig. 3), suggesting false-positive calls or 
contamination. The alternative allele frequency distributions of TSC 
homozygous and heterozygous SNVs sets in FI-SCs (right) show peaks at 

9% and 4%, respectively. These results indicate that FI-SC samples are 
approximately 10% contaminated by TSC samples. Original magnifications, 
×20 (a, d, e) and ×4 (c). 


FI-SCs, and reciprocal analyses of FI-SC heterozygous SNVs in 
TSCs and TSC homozygous and heterozygous SNVs in FI-SCs, 
revealed that FI-SCs were derived from a C57BL6 strain origin, with 
approximately 10% contamination from TSCs (Fig. 1i and Extended 
Data Fig. 3). These are concordant with the findings from a recent 
Table 1 | Global efforts to replicate STAP cell generation 

| Laboratory | Mouse strain | Starting cell types | Pluripotency reporter | STAP generation protocol | Pluripotency assessment assay | Experimental replicates | Conclusion |
|---|---|---|---|---|---|---|---|
| Jaenisch (MIT, USA) | C57BL6/129; JAX 008214 | MEFs; neonatal spleen; neonatal adipocytes; neonatal fibroblasts | Oct4-eGFP knock-in (Jaenisch); Nanog-eGFP knock-in (Jaenisch) | Obokata article; Obokata, Sasai & Niwa protocol* | Epifluorescence; FACS | 17 | Negative |
| Wernig (Stanford, USA) | C57BL6/129; JAX 008214 | MEFs | Oct4-eGFP knock-in (Jaenisch) | Obokata article | Epifluorescence; FACS; STAP-SC | 3 | Negative |
| Pei (Guangzhou, China) | C57BL6/CBA; JAX 004654 | Spleen MNCs, CD45+ spleen cells, MEFs, pre-iPS cells, EpiSCs, mammary epithelial cells | GOF18-Oct4ΔPE-GFP transgene (Schöler/Mann) | Obokata article; Obokata, Sasai & Niwa protocol* | Epifluorescence; FACS | 11 | Negative |
| Deng (Peking, China) | C57BL6 × ICR | MEFs, neonatal fibroblasts, neonatal bone marrow, neonatal brain, neonatal cardiac muscle cells, neonatal heart, neonatal lung, neonatal spleen, adult spleen | GOF18-Oct4-GFP transgene (Schöler; identical to Obokata article reporter) | Obokata article; Obokata, Sasai & Niwa protocol* | Epifluorescence; FACS; qPCR | 37 | Negative |
| Hanna (Weizmann, Israel) | C57BL6/CBA, JAX 004654 | Neonatal CD45+ spleen cells, neonatal liver, neonatal brain | GOF18-Oct4ΔPE-GFP transgene (Schöler/Mann) | Obokata article | Epifluorescence; FACS; STAP-SC | 5 | Negative |
| Hochedlinger (Harvard, USA) | C57BL6/129, JAX 008214 | Neonatal CD45+ splenocytes, E14.5 MEFs | Oct4-eGFP knock-in (Jaenisch) | Obokata article; BWH protocol†; Obokata, Sasai & Niwa protocol* | Epifluorescence; immunofluorescence; FACS; STAP-SC | 6 | Negative |
| Daley (Harvard, USA) | C57BL6/CBA, JAX 004654 | MEFs, neonatal CD45+ spleen cells, neonatal lung cells, neonatal liver, neonatal heart, neonatal brain | GOF18-Oct4ΔPE-GFP transgene (Schöler/Mann) | Obokata article; BWH protocol†; Obokata, Sasai & Niwa protocol* | Epifluorescence; FACS; qPCR; immunofluorescence; teratoma; chimaera | 54 | Negative |

*Obokata, Sasai & Niwa protocol: http://dx.doi.org/10.1038/protex.2014.008 
†BWH protocol: see Supplementary Information. 

The STAP papers were published in Nature on 29 January 2014 (both retracted in July 2014). Below, we provide the time period in which laboratories attempted to replicate STAP cell generation. The Jaenisch 
laboratory performed STAP replication experiments from February to April 2014. The Hanna laboratory performed STAP replication experiments from February to March 2014. The Wernig laboratory performed 
STAP replication experiments in February 2014. The Deng laboratory performed STAP replication experiments from February to July 2014. The Pei laboratory performed STAP replication experiments from 
February to May 2014. The Hochedlinger laboratory started STAP replication experiments on the day of STAP paper publication (29 January 2014) and continued to April 2014. The Daley laboratory performed 
STAP replication experiments and characterization of the STAP phenomenon from February to November 2014. 

eGFP, enhanced GFP; EpiSCs, epiblast stem cells; MEFs, mouse embryonic fibroblasts; MNCs, mononuclear cells; qPCR, quantitative PCR; STAP-SC, STAP stem cells. 


RIKEN report (http://www3.riken.jp/stap/e/cl3document52.pdf). 
This contamination with TSCs explains the high-grade placenta-forming 
capacity reported for the FI-SCs, an unusual feature that 
implied totipotency, but which seems to have been due to admixture 
of cells. 
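
The contamination estimate can be reasoned through with a simple mixing model, sketched here in Python. The model is an assumption made for illustration and is not the authors' stated method: it supposes that FI-SCs carry only the reference allele at the TSC variant sites, so that a contaminating fraction c of TSC material gives an alternative allele frequency of about c at TSC-homozygous sites and about c/2 at TSC-heterozygous sites. It also ignores differences in expression between cell types, which matter for RNA-seq reads. The peak values are those quoted in the Fig. 1i legend; the function name is hypothetical.

```python
def contamination_estimates(hom_peak, het_peak):
    """Estimate the contaminating fraction from alternative-allele frequency peaks.

    hom_peak: peak frequency at sites homozygous-alternative in the contaminant.
    het_peak: peak frequency at sites heterozygous in the contaminant.
    """
    return {
        "from_homozygous_sites": hom_peak,        # frequency is about c
        "from_heterozygous_sites": 2 * het_peak,  # frequency is about c / 2
    }


# Peaks of 9% and 4% as reported for TSC SNV sets measured in FI-SCs.
estimates = contamination_estimates(0.09, 0.04)
mean_estimate = sum(estimates.values()) / len(estimates)

for label, value in estimates.items():
    print(f"{label}: {value:.1%}")
print(f"mean: {mean_estimate:.1%}")
```

Both estimates fall a little under 10%, in line with the approximate contamination level stated in the text.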

In summary, our replication attempts and genetic analysis indi- 
cate that existing STAP protocols are neither robust nor repro- 
ducible. To substantiate future claims of reprogramming and 
alternative states of potency, we urge a rigorous application of sev- 
eral independent means for validating functional pluripotency and 
genomic profiling to confirm cell line provenance. Ultimately, the 
essential standard of robustness and reproducibility must be met for 
new claims to exert a positive and lasting influence on the research 
community. 
Extended Data Figure 1 | Validation of the Oct4-GFP transgenic reporter. 
a, Context-appropriate expression of the GOF18-Oct4ΔPE-GFP transgene 
reporter in the testes of 10-day-old neonatal male mice. b, STAP replication 
culture reagents sustain Oct4-GFP signal in Oct4-GFP mouse ES cells. Vacanti 
laboratory LIF and B27 supplement sustain self-renewal and strong GFP signal 
of Oct4-GFP mouse ES cells in serum/LIF (left) and N2B27 minimal media (see 
Methods) plus 2i/LIF (MEK inhibitor PD0325901 and GSK3-β inhibitor 
CHIR99021) (right). c, Reactivation of the GOF18-Oct4ΔPE-GFP reporter 
during direct reprogramming of MEFs by Oct4, Sox2 and Klf4. Left, phase- 
contrast images of founder GOF18-mouse iPS cells. Right, GFP signal in 
primary GOF18-mouse iPS cells. Note the heterogeneous reactivation of the 
GOF18-Oct4ΔPE-GFP reporter in primary founder mouse iPS cell colonies 
(derived in knockout serum replacement/LIF). d, GOF18-Oct4ΔPE-GFP 
reporter expression in established mouse iPS cell lines (passage 12). Left, phase- 
contrast images of established GOF18-mouse iPS cells. Mouse iPS cells were 
maintained on feeders in serum/LIF media. Right, note GFP signal in GOF18- 
mouse iPS cells. GFP is observed in essentially all iPS cell colonies and in 
most cells in each colony. GFP heterogeneity was slightly increased in 
GOF18-iPS cells compared with GOF18-ES cells. e, Developmental potential 
of GOF18-iPS cells. Top left, phase-contrast image of a teratoma generated 
from GOF18-iPS cells. Original magnification, ×4. Top right, to assess the 
developmental potential of GOF18-iPS cells, ‘all iPS cell embryos’ were 
generated by injection of GOF18-iPS cells into 4N blastocysts (‘tetraploid 
complementation’). A photograph of a live E13.5 embryo generated from 
GOF18-iPS cells is shown. Bottom row, gonadal contribution in all-iPS-cell 
embryos indicates GOF18-iPS cells are highly pluripotent. GFP is expressed in 
E13.5 days post-coitum (dpc) male gonads, and fluorescent cords are visible. 
The silencing of GFP in surrounding cells re-confirms the context-appropriate 
expression of the Oct4-GFP reporter. 
Extended Data Figure 2 | STAP replication data. a, Experimental scheme. 
Reactivation of the transgenic GOF18-Oct4ΔPE-GFP (Oct4-GFP) reporter to 
detect reprogramming after STAP treatment of somatic cells. b, Two STAP 
protocol variants. BWH: mechanical trituration and low pH treatment of lung 
cells. RIKEN: low pH treatment of spleen cells. Stressed cells were plated 
into non-adhesive dishes and cultured in DMEM/F12 medium plus B27 and 
LIF. c, qPCR analysis 7 days after STAP induction. Expression levels of Oct4, 
Sox2 and Nanog transcripts at ES-cell-like levels were not observed in lung 
cells or splenocytes after treatment with the BWH or RIKEN STAP protocol, 
respectively. Levels normalized to Gapdh. One replicate per protocol is shown. 
Although low level Sox2 and Nanog upregulation (1-2 ΔCt cycles; data not 
shown) was inconsistently observed, we speculate that minimal induction of 
Sox2 and Nanog messenger RNA may be due to relaxed transcriptional 
control in stressed cells. d, Nonspecific staining observed in STAP-treated cells 
suggests immunofluorescence artefacts. ES cells and autofluorescent spheres 
(BWH protocol) were processed in parallel and stained with Oct4 and Nanog 
antibodies. In contrast to the specific nuclear signal observed in positive- 
control ES cells, nonspecific and non-nuclear staining is observed in spheres 
generated after STAP treatment. Original magnification, ×20. e, Assessing the
presence of rare ES-cell-like cells in STAP-treated cultures by teratoma 
formation assays. STAP-treated cells were transplanted subcutaneously or into 
the kidney capsule to detect rare ES-cell-like pluripotent cells. If ES-cell-like 
cells are generated after transient low pH treatment with/without mechanical 
trituration, a teratoma containing elements of all three germ layers should form. 
STAP-treated cells did not form teratomas using conventional teratoma 
generation protocols. Left two images, immunocompromised mice injected 
subcutaneously with STAP-treated cells, which do not exhibit teratoma-like 
mass formation after approximately 4 months of observation. Right two 
images, kidneys after STAP-treated cells were transplanted into the kidney 
capsule indicating lack of teratoma-like formation after 3 months of 
observation. Black arrows indicate kidney transplanted with STAP-treated 
cells; second kidney from same mouse not transplanted with STAP-treated 
cells. f, Immunocompromised NOD/SCID mice transplanted with STAP- 
treated cells did not form teratoma-like masses. Summary of teratoma injection 
experiments. Every assessable injection of mouse ES cells produced teratomas 
(7 out of 8 positive-control ES-cell-injected mice formed teratomas within 
3-4 weeks. The mouse that did not form a teratoma immediately died from 
surgical complications and therefore was discarded from the analysis). n = 8 
independent injection sessions; n = 21 injection sites. Therefore, STAP-treated 
cells did not form teratomas using conventional methods. g, Extended 
histological analysis of a recovered STAP-PLGA mass (as in Fig. 1c). Obokata 
and colleagues'” reported a distinct teratoma production method that involved 
seeding STAP-treated cells onto a PLGA scaffold before implantation into
immunocompromised mice. Around 10-20 million STAP-treated cells from
GFP-positive mice were seeded into PLGA. GFP-positive cells were used to 
distinguish donor- and host-derived tissues. Left, positive-control ES cells 
formed teratomas with tissue derivatives of all three germ layers. Left, original 
magnifications (from top to bottom): ×40, ×20, ×40. Middle, recovered
STAP-PLGA mass, H&E staining. Middle, original magnifications (from top to
bottom): ×20, ×40, ×40. Right, recovered STAP-PLGA mass, Masson’s
staining (used to illustrate collagen deposition or presence of an inflammatory
reaction, which commonly occur in response to foreign body implants).
Right, original magnifications (from top to bottom): ×20, ×40, ×60. All
images were obtained from formalin-fixed/paraffin-embedded tissue sections. 
h, STAP-treated autofluorescent spheres failed to re-enter development after
morula aggregation. Unlike ES or iPS cells, autofluorescent spheres failed to 
incorporate into the inner cell mass of the host embryos (n = 20), suggesting 
incompatibility with the pre-implantation embryo. i, STAP-treated 
autofluorescent spheres failed to re-enter development after blastocyst 
injection. Mechanically disaggregated autofluorescent spheres were injected 
into pre-implantation blastocysts and implanted into pseudopregnant mice. 
From 17 implanted embryos, only two were recovered, which were 
developmentally abnormal, suggesting that the other 15 embryos died or were 
resorbed. j, Contribution of STAP-treated lung cells to chimaeras was not 
detected after blastocyst injection. Images of two abnormal E10.5 embryos with 
no obvious GFP signal that would indicate integration of donor test cells
into the developing host embryo. Original magnification, ×10.

k, Autofluorescence and Oct4-GFP fluorescence were distinguishable by 
fluorescence microscopy in cells containing the same Oct4-GFP transgenic 
reporter used by Obokata and colleagues!” (data from Deng laboratory). MEFs 
with the same transgenic Oct4-GFP reporter (GOF18-Oct4-GFP, intact PE) 
used in ref. 1 (passage 0) were treated with low pH solutions (pH 5.4 and 5.6, 
respectively). MEFs without low pH treatment were used as a negative control. 
After treatment, samples were cultured in suspension. Chemically induced 
pluripotent stem cells (CiPSC)** containing the transgenic Oct4-GFP reporter 
were used as a positive control for green fluorescence. GFP fluorescence was 
detected using a long-pass and band-pass filter. Red fluorescence was also 
observed in low-pH-treated MEFs, but not in CiPSC, as shown in the right 
column. Scale bar, 100 μm. l, ES-cell-like levels of Oct4, Sox2 and Nanog mRNA
(analysed by qPCR) were not observed 3 days after STAP treatment of
MEFs (data from Deng laboratory). MEFs were treated with low pH solutions
and cultured in suspension for 3 and 7 days (following the RIKEN STAP 
protocol) and analysed. R1 ES cells were used as a positive control. MEFs that 
were not subjected to the RIKEN STAP protocol but cultured in suspension 
medium were used as the negative control (—). 
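
Panel c reports Sox2 and Nanog shifts of 1-2 qPCR cycles. As a rough guide to what such a shift means in fold terms, the sketch below converts a cycle difference into a fold change. It assumes ideal amplification (product doubles every cycle); the function name and the `efficiency` parameter are illustrative and are not part of the original analysis.

```python
def fold_change(delta_ct: float, efficiency: float = 1.0) -> float:
    """Approximate fold change for a shift of delta_ct qPCR cycles.

    Each cycle is assumed to multiply product by (1 + efficiency).
    efficiency = 1.0 means ideal doubling, which is an assumption here,
    not a measured value.
    """
    return (1.0 + efficiency) ** delta_ct


for cycles in (1, 2):
    print(f"{cycles} cycle(s) -> about {fold_change(cycles):.0f}-fold")
```

Under the ideal-doubling assumption, a 1-2 cycle shift corresponds to roughly a 2- to 4-fold change.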
Extended Data Figure 3 | Re-analysis of published STAP RNA-seq data.

a, SNVs inferred from RNA-seq (from Fig. 1h) were further filtered to select 
only the known SNPs across mouse strains based on the Sanger database 

(see Methods). We compared the SNV profile inferred from STAP RNA-seq 
data to the expected profiles (simulated based on the known SNPs) in different 
mouse strains (magenta) as well as simulated first-generation hybrids of 
C57BL/6NJ and each of the other strains considered (orange). Selected SNVs 
are classified as homozygous for reference allele (0/0 genotype), homozygous 
for alternative allele (1/1 genotype) or heterozygous (0/1 genotype) at each 
locus. Samples are clustered based on the sum of edit distance between each 
SNV using complete linkage hierarchical clustering. Note that replicates of the 
same experiment are always grouped together. A subset of samples (CD45+,
STAP, STAP stem cells and ES cells) (genotype A as in Fig. 1h) are clustered
with simulated first-generation hybrids of C57BL/6NJ and 129S1, in
accordance with Obokata et al.” (LPJ strains have a profile similar to 129S1 for
the selected SNVs). Whereas FI-SCs (genotype B as in Fig. 1h) are closer to
the C57BL/6NJ strain (not the hybrid), EpiSC samples cluster with 129S1 or
LPJ simulated SNV profiles, both with some differences. Again the high
similarity between 129S1 and LPJ for these selected SNVs does not allow
discriminating which of them is closer to EpiSC samples. Finally, TSC samples
are clustered with other strains not mentioned by Obokata et al.’*. Overall, it 
is clear that TSC (as well as EpiSC) samples are derived from independent 
sources compared with STAP cells. b, Allele frequency distribution for SNVs 
shows number of reads for alternative alleles compared to the total number of 
reads for each SNV. The frequency of reads covering the alternative allele for 
heterozygous SNVs is expected to be approximately 50%, but in FI-SCs, it is 
nearly 12% (left, blue), suggesting false-positive calls or contamination (default 
thresholds in the variant calling algorithm result in incorrect classification of 
calls). We found that these FI-SC ‘heterozygous’ SNVs are predominantly 
homozygous for the alternative allele in TSCs (right, blue line), suggesting TSC 
samples as a contamination source in FI-SCs. The additional plots in Fig. 1i 
confirm that FI-SC samples are approximately 10% contaminated by TSC 
samples. c, Allele frequency distributions were independently calculated for 
all samples. As expected, the frequency of reads covering the alternative allele 
for heterozygous SNVs is approximately 50% (blue line) in all samples 
except FI-SCs (see b). In these plots, the first replicate (replicate 1) for each 
RNA-seq sample is reported; an almost identical profile is observed in each 
replicate pair. 
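
The reasoning in panel b can be illustrated with a simplified mixing model. This is a sketch under stated assumptions, not the authors' variant-calling pipeline: reads are assumed to come from a host sample and a contaminant in proportion to the contamination fraction, with no expression bias, and all read counts below are invented for illustration. The function names are hypothetical.

```python
from typing import Iterable, Tuple


def pooled_alt_fraction(sites: Iterable[Tuple[int, int]]) -> float:
    """Fraction of reads carrying the alternative allele, pooled over sites.

    Each site is (alt_reads, total_reads).
    """
    site_list = list(sites)
    total = sum(t for _, t in site_list)
    if total <= 0:
        raise ValueError("no reads to pool")
    return sum(a for a, _ in site_list) / total


def expected_alt_fraction(contamination: float,
                          host_alt_copies: int,
                          contaminant_alt_copies: int) -> float:
    """Expected alt-read fraction when a host sample is mixed with a contaminant.

    Genotypes are given as the number of alternative alleles (0, 1 or 2).
    Assumes reads are drawn in proportion to the mixing fraction.
    """
    host = host_alt_copies / 2.0
    contaminant = contaminant_alt_copies / 2.0
    return (1.0 - contamination) * host + contamination * contaminant


# Invented read counts at three hypothetical SNV sites: (alt_reads, total_reads).
example_sites = [(12, 100), (9, 80), (15, 120)]
print(f"pooled alt fraction: {pooled_alt_fraction(example_sites):.2f}")

# A true heterozygote with no contamination sits at 0.5 ...
print(f"clean heterozygote: {expected_alt_fraction(0.0, 1, 1):.2f}")
# ... whereas a homozygous-reference host with 10% of reads from a
# homozygous-alternative contaminant sits near 0.1.
print(f"10% contaminated:   {expected_alt_fraction(0.10, 0, 2):.2f}")
```

In this model, an alternative-allele fraction far below 50% at sites where another sample is homozygous for the alternative allele is what a small admixture of that sample would produce.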
FORUM Parkinson’s disease
Crystals of a toxic core


An ultra-high-resolution structure of the core segment of assembled α-synuclein — the protein that aggregates in the
brains of patients with Parkinson’s disease — has been determined. A neurobiologist and a structural biologist discuss
the implications of this advance. SEE ARTICLE P.486


THE PAPER IN BRIEF

• A small segment of α-synuclein is thought to form the core of the protein fibrils that
are associated with Parkinson’s disease and other synucleinopathies.
• On page 486 of this issue, Rodriguez et al. used the sophisticated electron-diffraction
technique MicroED to determine the structure of this tiny core segment of just
11 amino-acid residues.
• The 1.4-ångström structure is the highest resolution yet achieved through
cryo-electron-microscopy methods.
• The authors also present a MicroED structure of a segment of assembled α-synuclein
that is mutated in some cases of familial Parkinson’s disease.


Fibril features
MICHEL GOEDERT


The abnormal assembly of α-synuclein is central to Parkinson’s disease’. Fibrils
formed from α-synuclein (Lewy pathology) are seen in some brain neurons of more than
95% of patients with the disease, and their formation is associated with
neurodegeneration’. Certain mutations in the α-synuclein gene, SNCA, and
multiplications thereof, cause rare cases of Parkinson’s disease. Sequence variants in
the gene’s regulatory region are associated with increased disease risk. Moreover,
overexpression of mutant human α-synuclein in animal models causes its aggregation and
neurodegeneration.

α-Synuclein is a 140-amino-acid protein that is abundant in nerve cells, where it is
concentrated in nerve terminals. The protein binds to lipids through its amino-terminal
half, which comprises seven imperfect repeat sequences. Upon lipid binding, α-synuclein
takes on a partly α-helical structure. Under pathological conditions, it self-assembles
into oligomers and fibrils. A seed of assembled α-synuclein can trigger aggregation of
the soluble protein, and these insoluble aggregates slowly propagate through the brain.
The long interval between the formation of the first protein inclusions and the
appearance of disease symptoms opens a therapeutic window, provided that sufficiently
sensitive diagnostic techniques can be developed.

Unbranched α-synuclein fibrils are 5-10 nanometres in diameter and up to several
micrometres long. They assemble from the full-length protein, but only approximately
amino acids 30-100 make up the structured part. Mutations known to cause Parkinson’s
disease are located between residues 30 and 53. Like other insoluble fibrous protein
aggregates (known as amyloids), α-synuclein fibrils contain linked β-sheet structures.

Rodriguez and colleagues present structures of an 11-amino-acid peptide corresponding to
residues 68-78 of α-synuclein (Fig. 1), and of a peptide of residues 47-56 that contains
the disease-causing mutation A53T. What evidence is there that residues 68-78 form the
core of α-synuclein fibrils? Previous work has shown that deletion of residues 71-82
abolishes the ability of α-synuclein to assemble into fibrils, propagate and be
neurotoxic, and that a peptide of these amino acids will assemble into fibrils.
Similarly, deletion of residues 66-74 abolishes assembly and this peptide can also form
fibrils. Residues 68-78, studied by Rodriguez et al., can assemble into fibrils too’,
and the electron-diffraction pattern produced by these assemblies resembles that of
fibrils made from full-length α-synuclein. The authors show that fibrils of peptide
68-78 are toxic when externally applied to cells from tumour lines. However, aggregates
of α-synuclein form inside cells in Parkinson’s disease, so the relevance of this
extracellular toxicity is not clear.
Previous studies of fibrils assembled from full-length α-synuclein have shown that
residues 68-78 make up one of several β-strands. Although a complete description of the
fibril will require the atomic structures of all β-strands and the regions in between,
Rodriguez and colleagues’ findings echo previous atomic structures of amyloid-forming
peptides’. The structures show paired β-sheets with parallel β-strands in each sheet and
antiparallel β-strands between sheets (Fig. 1). However, the zipper structure that marks
the region between the paired sheets is longer than in other structures, and each pair
of β-sheets contains two water molecules, instead of being dry. This new structural
information may contribute to the development of molecules that can inhibit the
formation of α-synuclein fibrils, as has been shown for the aggregates of tau proteins
associated with Alzheimer’s disease’.

The structure of the assembled mutated A53T α-synuclein peptide shows pairs of
interdigitating β-sheets, but with β-strands kinked at residue 51. The authors postulate
that this mutated region interacts with the fibril-forming core to enhance aggregation.
It remains to be seen whether this model can account for the effects of other mutations
in this region that cause Parkinson’s disease. Furthermore, the fact that the mutations
A30P and E46K, which also cause Parkinson’s disease, lie outside the regions studied,
suggests that further structural surprises may be in store.


Michel Goedert is at the MRC Laboratory of 
Molecular Biology, Cambridge CB2 0QH, UK. 
e-mail: mg@mrc-lmb.cam.ac.uk 


Electron 
diffraction 


YIFAN CHENG 


The toxic cores of α-synuclein form well-ordered three-dimensional crystals (Fig. 1).
But these crystals are so small that they are invisible by light microscopy, and are
thus not amenable to structure determination using conventional X-ray crystallographic
techniques, or even X-ray free-electron laser (XFEL) technology. Now, Rodriguez et al.
have solved the crystals’ structure using MicroED — a method based on cryo-electron
microscopy (cryo-EM).

Figure 1 | The core of α-synuclein fibrils. An 11-amino-acid segment of α-synuclein is thought to form the core of the protein fibrils that are seen in patients
with Parkinson’s disease. a, Crystals of this core are so small that they can be seen only by electron microscopy. (Scale bar, 600 nm.) b, Rodriguez and colleagues’
MicroED structure of the crystal’ reveals that each peptide forms a β-strand, and these strands pair and stack together to form β-sheets. The structure also reveals
two water molecules (not shown) between the paired β-sheets. (Amino-acid residues 68-78 in each strand are labelled.)
A, JOSE A. RODRIGUEZ; B, MICHAEL R. SAWAYA

Cryo-EM encompasses several techniques 
used in structural biology. Single-particle cryo- 
EM, which determines structures by averaging 
images of many individual molecules, has 
already produced high-resolution structures 
of molecules that were for decades beyond the 
reach of other crystallographic methods’. Elec- 
tron crystallography has also produced atomic 
structures from molecules that form 2D crys- 
tals. The development of MicroED, which uses 
electron diffraction to determine the structure 
of microscopic 3D crystals*, has added a new 
method to the cryo-EM repertoire. 

Electron diffraction is an established EM 
technique that has been used to determine the 
atomic structures of membrane proteins, such 
as water channels, that form well-ordered 2D 
crystals”. But for 3D crystals, the situation is 
more complicated. The primary concerns are 
dynamic (multiple) scattering, and an inabil- 
ity to index and merge diffraction patterns 
collected from crystals that vary in size and 
morphology. MicroED resolves the indexing 
problem by tilting the crystal in the electron 
microscope and collecting multiple diffrac- 
tion patterns from a single crystal, in much 
the same way as in X-ray crystallography. 
Continuous rotation of the crystals during 
data collection also attenuates the dynamic- 
scattering effect’. 

Rodriguez and colleagues’ structures are 
the first to be determined by MicroED from 
a molecule of previously unknown structure. 
They are also the highest-resolution structures 


determined using any cryo-EM technique. 
The study thus demonstrates the tremendous 
potential of MicroED for use in cases where 
other crystallographic methods cannot be 
used. The interaction between an electron 
beam and a specimen is much stronger than 
with X-rays, enabling the collection of high- 
quality diffraction data from tiny crystals. As 
predicted more than 15 years ago"’, the charge 
of an electron makes it relatively easy for elec- 
tron diffraction to visualize charged atoms in a 
structure, such as protons, which require high 
resolution to be resolved by X-ray crystallography. The instrumentation is readily
available, in that an electron microscope can be operated either as a microscope to
produce an image of the specimen, such as for single-particle cryo-EM, or as a
diffractometer to produce a diffraction pattern of the specimen. The data-collection and
processing methods involved in MicroED are similar to those used in X-ray crystallography
and, in comparison with XFELs, the instrumentation cost is amazingly low and
accessibility substantially greater.

MicroED does have its limitations. The 
‘phase problem’ of crystallography — the 
fact that the phases of diffractions cannot be 
measured — can be particularly challenging 
in this approach. It is not easy to change the 
wavelength of an electron beam, so the tech- 
nique of using multi-wavelength anomalous 
diffraction to determine phase will probably 
not be applicable. And it is not clear whether 
isomorphous replacement, in which heavy 


metal atoms are inserted into the structure, will 
work for ab initio phasing, because dynamic 
scattering may reduce the diffraction signals 
generated by heavy metals. So far, all structures 
determined by MicroED used the molecular- 
replacement method for phase determina- 
tion, but it remains to be seen how the phasing 
problem will be resolved in future studies. 
There is probably also a constraint on the size 
of crystals that can be studied by MicroED. The 
strong scattering makes large crystals impen- 
etrable by electron beams, and merging dif- 
fraction patterns from different crystals may be 
difficult with crystals that have only a few unit 
cells in one direction. Nonetheless, MicroED 
provides another highly promising and complementary tool for structural biologists.
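
The remark that an electron beam's wavelength is hard to change can be made concrete: the wavelength is set by the microscope's accelerating voltage. The sketch below uses the standard relativistic de Broglie relation, λ = h / sqrt(2·m·eV·(1 + eV / (2·m·c²))). The voltages of 200 kV and 300 kV are common instrument settings chosen for illustration and are not taken from the study; the function name is hypothetical.

```python
import math

# Physical constants in SI units (CODATA values).
PLANCK = 6.62607015e-34            # J s
ELECTRON_MASS = 9.1093837015e-31   # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 299792458.0          # m / s


def electron_wavelength_angstrom(voltage_volts: float) -> float:
    """Relativistic de Broglie wavelength of an electron, in angstroms."""
    if voltage_volts <= 0:
        raise ValueError("accelerating voltage must be positive")
    kinetic_energy = ELEMENTARY_CHARGE * voltage_volts
    rest_energy = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * kinetic_energy
        * (1.0 + kinetic_energy / (2.0 * rest_energy))
    )
    return PLANCK / momentum * 1e10  # metres to angstroms


for kilovolts in (200, 300):
    wavelength = electron_wavelength_angstrom(kilovolts * 1e3)
    print(f"{kilovolts} kV: {wavelength:.4f} angstrom")
```

At these voltages the wavelength is a few hundredths of an ångström, and shifting it means changing the accelerating voltage of the instrument itself.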
Storms bring ocean
nutrients to light


Ships and ocean-observing robots have been used to quantify the amount of
nutrients that a storm brings up from the Stygian ocean depths to the sunlit 
surface — a first step in assessing how storms affect oceanic biomass production. 


JAIME PALTER 


Seawater absorbs most sunlight within 100 metres of the ocean’s surface, where
available inorganic nutrients are rapidly assimilated by photosynthesizing plankton. As
these cells die or are excreted in a consumer’s waste, they sink into the dark ocean
interior, where microbial decay returns their nutrients to the dissolved inorganic form.
The physical processes that lift the resulting nutrient-rich water from the dark interior
to the sunlit upper ocean are responsible for sustaining nearly all marine life. Writing
in Global Biogeochemical Cycles, Rumyantseva et al.' quantify how a mid-latitude storm
and its after-effects drive upward pulses of nutrients, thereby advancing our
understanding of the processes that nourish plankton at the base of the marine food chain.

Illuminating the processes that bring nutrients into the light has been a preoccupation
of oceanographers for generations’, a pursuit made more urgent by predictions (see refs 3
and 4, for example) that global warming may suppress the upward nutrient supply, with
deleterious effects for many plankton. Such predictions are premised on a causal chain
linking the warming of near-surface waters to a strengthening of the density gradient
between the surface and interior ocean, which, in turn, increases the amount of energy
required to bring subsurface nutrients upwards. However, the future of the physical
drivers that provide this energy is rarely considered.

Figure 1 | Uplift of nutrients in the ocean during and after a storm. a, During a storm,
the current in the upper ocean accelerates in response to the wind passing across it,
whereas the interior ocean not far below is relatively unperturbed. This sets up a highly
sheared current at the interface between the upper and the interior ocean, which is
susceptible to the development of turbulence. Turbulent mixing (white arrows) thus
brings nutrients from the interior to the surface. b, After the storm, the wind

The turbulence caused by stormy seas is 
thought to be one mechanism for mixing 
nutrients upwards. This turbulent mixing is 
rarely documented because of the challenges 
of conducting ship-based seawater sampling 
in high winds. Therein lies the novelty of 
Rumyantseva and co-authors’ study, which 
reports nutrient concentrations, the intensity 
of turbulent mixing and the velocities of ocean 
currents before, during and after the passage of 
a North Atlantic storm. They solved the sam- 
pling problem in part by using torpedo-shaped 
robots called gliders’, which continuously 
measured water-column properties regard- 
less of the weather. 

The authors report an approximately tenfold 
increase in surface nutrient concentrations 
during the storm, followed by two short-lived 
bursts of nutrients after the storm. They also 
observed that the concentration of chlorophyll, 
the pigment responsible for most photosynthesis, rose by about 50% near the surface
between the last day of the storm and several
days later.

The study provides insight into how storms 
can produce upward nutrient pulses, even 
after they cease. Wind blowing over the ocean 
during a storm sets its surface layer in motion,
but the interior ocean not far below is relatively 
unperturbed. This creates a sharp change 
in the ocean-current velocity (that is, high 
shear) at the interface between the surface and 
the interior that leads to instabilities in the cir- 
culation and the development of turbulence 
(Fig. 1a). Such turbulence can mix nutrients 
into the sunlit region, and, in the case of the 
observed storm, was responsible for the rapid 
increase in surface nutrient concentrations. 

After storm winds relent, the surface waters 
oscillate like a pendulum, rocking back and 
forth while the planet rotates underneath. The 
circular path traced by the water relative to a 
fixed point on Earth is called an inertial oscil- 
lation. The shear spikes again when the wind 
aligns with the ocean surface current (Fig. 1b) 
— which occurred approximately twice a day 
for a steady wind over the inertial oscillation 
observed by Rumyantseva and colleagues. The 
authors report that these periods of alignment 
coincided with an upward delivery of nutri- 
ents at a rate 25 times faster than background. 
However, given that these bursts of intense 
turbulent mixing lasted only about an hour 
each, their total contribution to the nutrient 
supply was smaller than that delivered during 
the storm. 
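
How long one circuit of an inertial oscillation takes depends on latitude through the Coriolis parameter. The sketch below uses the standard textbook relation T = 2π / |f|, with f = 2Ω sin(latitude), rather than anything taken from the study. The latitudes are illustrative assumptions, not the study site, and the function name is hypothetical.

```python
import math

EARTH_ROTATION_RATE = 7.2921e-5  # rad/s, sidereal rotation rate of Earth


def inertial_period_hours(latitude_deg: float) -> float:
    """Period of an inertial oscillation in hours: T = 2*pi / |f|.

    f = 2 * Omega * sin(latitude) is the Coriolis parameter.
    """
    coriolis = 2.0 * EARTH_ROTATION_RATE * math.sin(math.radians(latitude_deg))
    if abs(coriolis) < 1e-12:
        raise ValueError("inertial period is undefined at the equator")
    return 2.0 * math.pi / abs(coriolis) / 3600.0


for latitude in (30.0, 45.0, 60.0):
    print(f"{latitude:.0f} deg: {inertial_period_hours(latitude):.1f} h")
```

The period shortens towards the poles, so how often a steady wind lines up with the rotating surface current depends on where the storm passes.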

The authors suggest that the net effect of 
non-winter storms might locally contribute 
up to 30% as much nutrient as is supplied by 
wintertime convective mixing, the mechanism 
that sustains the annual algal blooming of the 
North Atlantic. But measurements of a single 
event cannot resolve how storms influence 
biology on the scale of ocean basins. A study* in the subtropical North Atlantic showed
that cyclones drive only a minor increase in the mean amount of chlorophyll during the
hurricane season, because their direct influence is felt by a small fraction of the
region. This interpretation was recently questioned, however, by another study’ that
found an association between year-to-year variability of chlorophyll concentration and
total cyclone energy during the hurricane season. It therefore remains unclear whether
storms are major drivers of such variability on large scales.

gradually weakens, and the upper ocean rocks back and forth while the planet rotates
underneath. The current therefore traces a circular path relative to a fixed point on
Earth — this is known as an inertial oscillation. When the wind and upper-ocean current
align, strong turbulent mixing again brings nutrients to the surface. Rumyantseva et al.'
have measured the amount of nutrient brought to the surface during and after a storm in
the North Atlantic Ocean.

Moreover, neither the current paper nor 
the subtropical studies®’ extended conclu- 
sions from chlorophyll concentrations to rates 
of biological productivity — which is a more 
relevant metric for understanding the transfer 
of energy up the food chain, but more chal- 
lenging to measure. There is thus still much to 
debate about the magnitude of the large-scale 
biological response to storms. 

Future changes in nutrient supply to the ocean surface will depend on both the energy
required to bring nutrients to the surface and that available from winds and other
physical forcing*. Climate models widely agree that future warming will strengthen the
ocean’s vertical density gradient and increase the energy required to mix nutrients
upwards”. The future of storms is less clear. Climate models generally predict a
poleward shift of storm tracks and intensified storms in the Southern Hemisphere,
whereas corresponding predictions for the Northern Hemisphere — and particularly the
North Atlantic — are highly uncertain’.

Attempts to predict the biological influence of storms are further hampered by the
inadequate representation of inertial oscillations in climate models, a gap that
researchers are now working to fill’®. Such improved numerical representations of how
storms influence ocean mixing, and a stronger handle on the large-scale influence of
storms on present-day biology, will give us a clearer vision of what the future may hold
for marine plankton and the organisms that depend on them.


CARDIAC BIOLOGY

A protein for healing
infarcted hearts

Human heart tissue has minimal ability to regenerate following injury. But the 
protein Fstl1, which is normally expressed in the heart’s epicardial region, has now
been shown to induce regeneration following heart attack. SEE ARTICLE P.479 


GORDANA VUNJAK-NOVAKOVIC 


Healthy mammalian heart tissue has a
measurable but limited ability to
regenerate’. Over a normal human
lifespan, around 45% of heart-muscle cells 
(cardiomyocytes) are renewed, with the 
remaining 55% persisting from birth. This rate 
is not sufficient to repair the injury caused by 
myocardial infarction, or heart attack as it is 
commonly known. Instead, the infarcted area 
becomes populated by fibroblast cells, which 
form a non-contractile collagenous scar — 
a quick fix that progressively decreases the 
heart’s pumping capacity. Any regenerative 
therapy must thus provide an influx of cells 
that can properly heal the muscle, either from 
an external source or from the body itself”. A 
major effort** is going into the potential use 
of immature cardiomyocytes derived from 
human stem cells for such regeneration. But 
in this issue, Wei et al.” (page 479) take a dif- 
ferent approach, making use of a protein that is 
present in the epicardial region of a healthy 
heart but lost following heart infarction. 
Previous work documented that the 
protein follistatin-like 1 (Fstl1) is involved 


in the development of many organ systems, 
by binding to proteins of the transforming 
growth factor-β (TGF-β) family and inhibiting
their functions’. Fstl1 is also involved in a spectrum
of diseases, from heart attacks to arthritis,
lung fibrosis and cancer, through its activation
of multiple signalling pathways and inflam- 
matory and immune responses. Depending 
on the organ system, Fstl1 can act as a
pro-inflammatory molecule’ or a cell-protective
factor®, or can induce immune responses’. 
Fstl1 is also known as a modulator of cardiac 
development’ and as a marker of heart ischae- 
mia (restricted blood supply), hypertrophy 
(abnormal enlargement) and end-stage heart 
failure®. The many roles of Fstl1 are seemingly 
contradictory: it protects cardiomyocytes 
from apoptotic cell death and hypertrophy 
by mobilizing signalling through the phos- 
phorylated kinase enzyme AMPK, but it sup- 
presses their differentiation from stem cells 
by inhibiting signalling of the TGF-β family
member BMP. Notably, the presence of Fstl1
in the heart correlates with reduced infarct size
and with functional recovery, but this effect
has been ascribed to enhanced reformation
of blood vessels (revascularization) and
cell survival, rather than the formation of
cardiomyocytes”.

Figure 1 | Patched up to heal. Myocardial infarction (a heart attack) results in a massive loss of heart-muscle
cells (cardiomyocytes), which need to be regenerated for the injured site to heal in a way that
restores pumping. Wei et al.° show that the protein Fstl1 is typically expressed in the epicardial layer
surrounding the myocardium (muscle), and that this protein can induce cardiomyocyte proliferation —
but also that this cardiogenic activity is lost when the expression shifts to the myocardium following
infarction. However, they show that applying a collagen patch containing epicardial Fstl1 to a mouse
heart immediately after infarction can reconstitute the cardiogenic activity of the protein and regenerate
the heart muscle.

Wei et al. now provide new and counter- 
intuitive insights into the biological functions 
of Fstl1. Their study shows that, in the healthy 
heart, the protein is expressed in the epicar- 
dium, the membranous layer surrounding the 
myocardium, throughout development and in 
adult life. They also observed that heart infarc- 
tion causes Fstl1 expression to be transferred 
from the epicardium to the myocardium, and 
that this shift impairs the heart's regenerative 
ability. Remarkably, the study reveals that
re-established expression of epicardial Fstl1 can
regenerate the injured heart muscle.

The investigators hypothesized that a patch 
releasing epicardial Fstl1, when placed onto
the heart infarct, would serve as a source of 
Fstl1 and stimulate proliferation of the resident 
cardiomyocytes (Fig. 1). To test this hypoth- 
esis, they loaded collagen patches either with 
medium containing Fstl1 in which epicardial 
cells had been cultured, or with human Fstl1 
purified from a bacterial protein-expression 
system, and sutured these to the hearts of 
mice that had undergone modelled myo- 
cardial infarction. Four weeks later, they 
observed more cardiomyocytes, higher transcription
of cardiac marker genes and a greater
frequency of calcium pulses (indicative of 
heart pumping) compared with infarcted 
hearts without patches. There was also less 
formation of fibrotic scar tissue and a better 
revascularization of the area. These findings 
suggest that such in situ manipulation might 
allow control of the fate of existing cardio- 
myocytes, to achieve heart regeneration with- 
out implanting cells. 

This study is an inspiring example of how 
a developmentally conserved regulatory 
pathway can be mobilized to induce heart 
regeneration. Although more work needs to 
be done to determine the benefits of such an 
approach in large-animal models (the authors 
conducted a preliminary study in pigs, but it 
involved only six animals divided into three 
groups), the proposed reconstitution of epicar- 
dial Fstl1 could lead to entirely new modalities 
for treating heart infarction. The study also 
leaves us with questions about the biologi- 
cal phenomena responsible for the observed 
effects. Fstl1 is still an enigmatic protein with 
largely unknown properties, but with seem- 
ingly huge potential for diagnosing and treating 
heart disease. 

One intriguing question is why infarction- 
induced myocardial expression of Fstl1, or 
even experimentally induced overexpression 
of Fstl1 in the myocardium, cannot induce 
heart regeneration, but epicardial Fstl1 applied 
on the patch can. The authors also find this 
result paradoxical. They suggest that different 
extents of glycosylation (the number of 
carbohydrate molecules — glycans — attached 
to the protein) that they measured for epicar- 
dial and myocardial Fstl1 reflect differences 
in glycan structure that affect the proteins’ 
function. It remains to be seen whether these 
differences are cell-of-origin specific, how 
important glycosylation is for regenerative 
ability, and what the necessary features would 
be for a patch that can induce regeneration in 
the human heart. 

Other questions arise from the combined 
observations that, although myocardial Fstl1 
does not induce cardiomyocyte generation, 
it does protect immature cardiomyocytes, 
whereas epicardial Fstl1 on a patch enhances 
cardiomyocyte proliferation, but is not cell- 
protective. Further investigation is needed 
to explore whether glycosylation is a key 
determinant of cardioprotective versus car- 
diogenic effects, as proposed by Wei and col- 
leagues. Finally, the study suggests that only 
very immature cardiomyocytes respond 
to Fstl1. The genetic signatures and the 
origin of the responsive cells (whether they 
are resident or recruited) also remain to be 
determined. 

These questions are likely to motivate future 
studies. Exciting approaches are now emerging 
at the interface of stem-cell biology and tissue 
engineering. High-fidelity models of human 
heart tissue, combined with findings such as 
these, could markedly advance quantitative 
biological research and the clinical transla- 
tion of discoveries into curative treatments 
for heart disease. 


Gordana Vunjak-Novakovic is in the 
Departments of Biomedical Engineering and 
Medicine, Columbia University, New York, 
New York 10032, USA. 

e-mail: gv2131@columbia.edu 


1. Bergmann, O. et al. Science 324, 98-102 (2009). 
2. Laflamme, M. A. & Murry, C. E. Nature 473, 
326-335 (2011). 
3. Chong, J. J. et al. Nature 510, 273-277 (2014). 
4. Menasché, P. et al. Eur. Heart J. 36, 2011-2017 
(2015). 
5. Wei, K. et al. Nature 525, 479-485 (2015). 
6. Sylva, M., Moorman, A. F. M. & van den Hoff, M. J. B. 
Birth Defects Res. C 99, 61-69 (2013). 
7. Miyamae, T. et al. J. Immunol. 177, 4758-4762 
(2006). 
8. Ogura, Y. et al. Circulation 126, 1728-1738 
(2012). 
9. Mercola, M., Ruiz-Lozano, P. & Schneider, M. D. 
Genes Dev. 25, 299-309 (2011). 
10. van Wijk, B., Gunst, Q. D., Moorman, A. F. M. & 
van den Hoff, M. J. B. PLoS ONE 7, e44692 
(2012). 




This article was published online on 16 September 2015. 


Neutrons with a twist 


Neutrons do not normally have orbital angular momentum. But the demonstration 
that a beam of neutrons can acquire this property, 23 years after it was shown in 
photons, offers the promise of improved imaging technologies. SEE LETTER P.504 


ROBERT W. BOYD 


Neutrons were discovered in 1932 
by the physicist James Chadwick, 
and the particles continue to amaze 
scientists to this day. It was initially thought 
that neutrons were elementary particles — that 
is, that they were not composed of other par- 
ticles. But we now know that, just like protons, 
neutrons are made up of three elementary 
particles called quarks. Quarks have an intrinsic 
property known as spin angular momentum 
(or spin), and they endow the neutron with 
a spin that has a value of ½ħ (where ħ is the 
reduced Planck constant). On page 504 of this 
issue, Clark et al.¹ show that a free neutron can 
have a different kind of angular momentum: 
orbital angular momentum (OAM). 

OAM is a broad concept in modern phys- 
ics, but is usually associated with the motion of 
electrons around the atomic nucleus in atoms 
and molecules. In contrast to spin, OAM is 
not an intrinsic property of the electron: it can 
take any value of an integer L multiplied by ħ, 
whereas the electron’s spin has a fixed value of 
½ħ. Electron spin and OAM are analogous to 
Earth's rotation on its axis and its orbit around 
the Sun, respectively. 

But OAM has also arisen in a different con- 
text: in the early 1990s, it was theoretically” 
and experimentally’ shown that any helically 
phased light beam can possess OAM. It has 
since been established that this is true even 
for a single photon’. This is therefore another 
source of angular momentum for the particle, 
in addition to its spin (which is associated with 
the circular polarization of light). It is a crucial 
property of photons that has found applica- 
tions in the field of photonics, such as the cod- 
ing of quantum‘ and classical’ information in 
individual photons, quantum-entanglement 
protocols® and the manipulation of small par- 
ticles by optical forces’. 

In 2010, electron beams with OAM were 
also generated, confirming that this property 
is not limited to light beams®. Many advances 
in the production and use of OAM-carrying 
electron beams have since been reported (see 
Figure 1 | Orbital angular momentum of neutrons. Clark et al.¹ channelled a beam of neutrons through 
a device known as a spiral phase plate, which modified the neutrons’ original, planar wavefunctions and 
imparted orbital angular momentum to the particles. The wavefunction of the neutrons that emerge from 
the device has acquired an azimuthal phase distribution of the form e^{iLφ} (where i is the imaginary unit, L is 
any integer and φ is the azimuthal angle of the plate). This phase variation causes the helical structure seen 
in the emergent wavefunction, which is associated with the acquired orbital angular momentum. 


ref. 9 for a review). The fact that photons are 
not the only particles that can have OAM 
has opened up possibilities for fundamental 
studies of electromagnetic interactions and 
for applications such as improved electron 
microscopes. 

Clark and colleagues’ work adds neutrons 
to the list of particles that can have OAM. The 
authors generate OAM-carrying neutrons 
by guiding a beam of the particles through a 
device known as a spiral phase plate (Fig. 1). 
The thickness of this device varies uniformly as 
a function of the plate’s azimuthal angle, φ (the 
angle measured around the circumference of 
the plate). The wavefunction of a neutron passing 
through this device acquires a phase shift 
that is proportional to the plate’s local thickness. 
For appropriate values of the variation 
of thickness with φ, the wavefunction acquires 
an azimuthal phase distribution given by e^{iLφ}, 
where L is any positive or negative integer and 
i is the ‘imaginary unit’ (the square root of −1). 
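
The azimuthal phase described above can be illustrated with a short numerical sketch. It assumes an idealised plate that imprints exactly the phase factor exp(iLφ), and it simply counts how many full turns the phase makes in one trip around the plate. The function names and the sampling density are illustrative choices, not anything taken from Clark and colleagues' work.

```python
import cmath
import math


def azimuthal_phase_factor(L, phi):
    """Idealised phase factor exp(i*L*phi) at azimuthal angle phi (radians)."""
    return cmath.exp(1j * L * phi)


def winding_number(L, samples=3600):
    """Count the full 2*pi turns of the phase in one circuit of the plate.

    Assumes |L| is much smaller than samples / 2, so that each step between
    neighbouring sample points changes the phase by less than pi.
    """
    total = 0.0
    previous = azimuthal_phase_factor(L, 0.0)
    for k in range(1, samples + 1):
        phi = 2.0 * math.pi * k / samples
        current = azimuthal_phase_factor(L, phi)
        # Phase step between neighbouring samples, wrapped into (-pi, pi].
        total += cmath.phase(current / previous)
        previous = current
    return round(total / (2.0 * math.pi))


if __name__ == "__main__":
    for L in (-2, 0, 1, 3):
        print(L, winding_number(L))
```

In this idealised picture the winding number equals L, and the orbital angular momentum carried per neutron is L multiplied by ħ.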

The authors fabricated several plates whose 
thickness distributions corresponded to vari- 
ous values of L, and thus generated neutron 
beams carrying OAM of different Lħ values. 
Like its spin, a neutron’s OAM is a quantum- 
mechanical attribute. It occurs as a conse- 
quence of the helical structure of the particle's 
‘twisted’ wavefunction when it emerges from 
the plate. To verify that the neutron beam had 
acquired OAM as it passed through the plate, 
Clark et al. used a technique known as neutron 
interferometry. In this approach, the neutron 
wavefunction was split into two paths and a 
spiral phase plate was placed in one of them. 
The two paths were subsequently combined 
coherently to form an output beam whose 
interference pattern showed the azimuthal 
phase distribution that the wavefunction 
had acquired. 

Although Clark and colleagues’ results are 
impressive, they represent only the first step 
in an emerging field of research. For example, 
in the present experiment, the neutron beam 
falling on the spiral phase plate is a statisti- 
cal mixture of several OAM quantum states. 
Before applications can be developed, neutrons 
must be generated that have quantum states 
with definite OAM values (eigenstates). In 
addition, holographic methods have been 
developed for creating optical’*”’ and elec- 
tron'’* OAM states, and these are more precise 
and versatile than the use of spiral phase plates. 
It will thus be interesting to explore the use 
of holographic techniques for neutrons too. 
The potential use of neutron OAM states for 
quantum-information studies is another excit- 
ing prospect. 

Finally, Clark and colleagues’ study opens up 
a further avenue for future work: the use of neu- 
tron beams with OAM for imaging. Because 
neutrons are penetrating particles, they could 
offer practical advantages compared with opti- 
cal and electron microscopy in deep-imaging 
studies of materials. One might therefore 
conclude that OAM-carrying neutron beams 
may boldly go where no quantum particle has 
gone before. 
50 Years Ago 


A Biological Retrospect. By Sir 
Peter Medawar — The title of my 
presidential address, as you will 
have discerned, is “A Biological 
Retrospect”, and on the whole 
it has not been well received. 
‘Why a biological retrospect?’, I 
have been asked; would it not be 
more in keeping with the spirit 
of the occasion if I were to speak 
of the future of biology rather 
than of its past? Unfortunately, 
it is impossible to predict new 
ideas ... and we are caught in a 
logical paradox the moment we 
try to do so. For to predict an 
idea is to have an idea, and if we 
have an idea it can no longer be 
the subject of a prediction. Try 
completing the sentence ‘I predict 
that at the next meeting of the 
British Association someone will 
propound the following new theory 
of the relationships of elementary 
particles, namely ...’. If I complete 
the sentence, the theory will not be 
new next year; if I fail, then I am not 
making a prediction. 

From Nature 25 September 1965 


100 Years Ago 


We have still ... very much to learn 
about causes in action; and the 
mystery of the earth, and of our 
connection with it, grows upon 
us as we learn. Can we at all realise 
the greatest change that ever came 
upon the globe, the moment when 
living matter appeared upon its 
surface ... And here was living 
matter, a product of the slime, 
if you will, but of a slime more 
glorious than the stars. Was this 
thing, life, a surface-concentration, 
a specialisation, of something 
that had previously permeated all 
matter, but had remained powerless 
because it was infinitely diffuse? 
Here you will perceive that the 
mere geologist is very much beyond 
his depth. 

From Nature 23 September 1915 
Infection elevates diversity 


Chromosomal shuffling in parental eggs or sperm can create new characteristics 
in the next generation. In fruit flies, it seems that mothers with a parasitic 
infection produce more such recombinant offspring than uninfected mothers. 


ANEIL F. AGRAWAL 


In most plants and animals, offspring are 
genetically distinct from their parents. 
Through the process of recombination, 
which occurs as sperm and egg cells (gametes) 
are produced, a parent can mix the two cop- 
ies of a given chromosome received from its 
own parents, thus transmitting unique chro- 
mosomes to its offspring. Writing in Science, 
Singh et al.¹ show that fruit flies produce a 
higher frequency of offspring with recom- 
binant chromosomes when the mother is 
infected with a parasite than when it is unin- 
fected. This intriguing observation may be an 
important piece in the long-standing puzzle of 
why recombination is so common. 

Why should organisms shuffle their 
genomes through sex and recombination? Nat- 
ural selection should create an excess of good 
gene combinations, so ‘undoing’ the work of 
past selection by rearranging these genotypes 
seems counterproductive. One possible expla- 
nation is that what constitutes a good combi- 
nation of alleles (gene variants) changes over 
time. In that case, undoing the work of past 
selection is beneficial because selection in the 
future demands something different. This idea 
requires that selection on gene combinations 
changes regularly”. 

Coevolving natural enemies — particularly 
parasites — might provide just the right type 
of selection pressures for this scenario. This is 
the basis for the ‘Red Queer hypothesis, which 
proposes that sexual reproduction and recom- 
bination are favoured because they help hosts 
to adapt to the ever-shifting selection imposed 
on their gene combinations by the parasites**. 
However, even rapidly evolving parasites do 
not always induce selection for recombina- 
tion; there are times in the coevolutionary 
cycle when hosts are well adapted and non- 
recombinant offspring will be more resistant 
to infection than recombinant ones”*. Intuitively, 
it might seem that the ideal solution is to 
increase recombination when infected because 
being infected indicates that your current gene 
combination is not working. 

To test this, Singh et al. performed hundreds 
of test crosses using female Drosophila mela- 
nogaster fruit flies that carried mutations at 
each of two genes on one chromosome, but 
that had normal versions of the genes on their 
other copy of the chromosome. The presence 
of either mutation leads to visible physical 
characteristics that allow determination of 
whether one or both mutations are present in 
their offspring — because the genes are in close 
physical proximity, the normal or mutated 
versions will be inherited together unless there 
has been recombination (Fig. 1). 
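
The scoring logic of such a test cross can be sketched in a few lines. The sketch assumes that every offspring can be sorted into one of four visible classes (both mutations, neither, or only one of the two) and that offspring showing exactly one mutation are the recombinants. The function name and all counts are invented for illustration and are not data from Singh and colleagues.

```python
def recombinant_fraction(both, neither, only_first, only_second):
    """Fraction of offspring that carry exactly one of the two linked mutations.

    both / neither : offspring showing both or neither mutant trait
                     (the parental, non-recombinant classes)
    only_first / only_second : offspring showing a single mutant trait
                     (the recombinant classes)
    """
    total = both + neither + only_first + only_second
    if total == 0:
        raise ValueError("no offspring were scored")
    return (only_first + only_second) / total


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the comparison being made.
    sham = recombinant_fraction(4100, 4050, 900, 950)
    infected = recombinant_fraction(3900, 3950, 1050, 1100)
    print(f"sham-injected mothers: {sham:.3f}")
    print(f"infected mothers:      {infected:.3f}")
```

Comparing the two fractions across many crosses is the kind of contrast the experiment relies on. A real analysis would also need a statistical test, which this sketch omits.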


[Figure 1 panel labels: Non-infected mother; Recombinant offspring; Non-recombinant offspring] 

The females were injected with one of two 
bacterial pathogens (Serratia marcescens or 
Providencia rettgeri) or given a sham injection. 
By examining tens of thousands of the flies’ 
progeny, the authors found that infected moth- 
ers produced a higher fraction of recombinant 
offspring than non-infected mothers. This 
effect was seen in four fly strains. Infection 
with a parasitoid wasp (Leptopilina clavipes) 
also induced an increase in recombinant 
progeny. Unlike the bacterial experiments, in 
which reproductive adults were infected, the 
parasitoid wasp infects fruit-fly larvae and 
the parasites must be killed for the larva to 
survive to adulthood. Thus, in this situation, 
the infection is cleared long before meiosis (the 
cell division necessary to produce gametes and 
during which recombination occurs). 

An increase in the observed frequency of 
recombinant progeny from infected mothers 
could be due to an increase in the recombi- 
nation rate or to transmission distortion (for 
example, if recombinant chromosomes are 
more likely than non-recombinant chromo- 
somes to end up in successful gametes). To 
tease these possibilities apart, Singh et al. made 
use of the fact that exchange of chromosomal 
material (crossover events) occurs 4-5 days 
before eggs are laid. In their bacterial-infection 
experiments, the authors found an increase in 
recombinant progeny even in the first 4 days 
after the mothers were infected. This rapid 
response points to transmission distortion. 
A remaining challenge will be to understand 
how this distortion occurs. Are recombinant 
chromosomes less likely than non-recombi- 
nant ones to end up in polar bodies, the small 


[Figure 1 panel label: Infected mother] 


Figure 1 | Frequency of recombinant offspring altered by infection. Diploid organisms, such as fruit 
flies and humans, have two copies of each chromosome, which can vary in DNA sequence (represented 
by A versus a and B versus b) in every cell except gametes (sperm and egg cells). Gametes contain 
only one copy of each chromosome, such that fertilization results in two copies again in the offspring. 
The sequence in the offspring can be the same as the parental chromosome, or an exchange of genetic 
material between the two chromosomes during gamete production — recombination — can result in 
different sequences. Singh et al.¹ show that fruit-fly mothers that are infected with parasites produce more 
recombinant offspring than uninfected mothers. 

cells that are formed during meiosis but do 
not transmit genes to future generations? Are 
gametes bearing recombinant chromosomes 
more viable, or do they somehow outcompete 
gametes that have non-recombinant chromo- 
somes? What mechanism might mediate such a 
bias? Furthermore, Singh and colleagues’ study 
focuses on a single genomic region; it will be 
of interest to assess whether similar responses 
occur elsewhere in the genome, and if not, why 
this region is particularly responsive. 
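
The timing argument for transmission distortion can be written out as a small check. It assumes, as stated above, that crossovers happen 4-5 days before an egg is laid, and it asks whether the crossover behind a given egg must have preceded the infection. The function names and the day-counting convention (day 0 is the day of infection) are illustrative assumptions.

```python
def crossover_day_range(laid_day, min_lead=4.0, max_lead=5.0):
    """Earliest and latest day of the crossover, relative to infection at day 0.

    laid_day is the number of days after infection on which the egg was laid.
    min_lead and max_lead bound how long before laying crossovers occur.
    """
    return (laid_day - max_lead, laid_day - min_lead)


def crossover_preceded_infection(laid_day, min_lead=4.0, max_lead=5.0):
    """True if even the latest possible crossover time falls before infection."""
    _, latest = crossover_day_range(laid_day, min_lead, max_lead)
    return latest < 0


if __name__ == "__main__":
    for day in (1, 2, 3, 6):
        print(day, crossover_day_range(day), crossover_preceded_infection(day))
```

For eggs laid within the first few days, the crossovers were already complete when the mother was infected, so an excess of recombinant progeny in that window cannot come from a raised crossover rate. That is the reasoning that points to transmission distortion.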

Previous work has demonstrated that patho- 
gens increase recombination in plants during 
meiotic and somatic (non-gamete) division”. 
That the few existing examples of this phenom- 
enon span plants and animals suggests that 
pathogen-induced increases in the recombi- 
nant fraction could be widespread, although 
perhaps achieved through different means, 
for example transmission distortion in flies 
but higher recombination in plants. If so, does 
this intriguing connection between patho- 
gens and natural variation in recombination 
constitute convincing evidence to support the 
Red Queen hypothesis? 

Changes in the proportion of recombinant 
offspring in flies and other organisms have 
been reported in response to various types of 
environmental stress (such as temperature, 
nutrition and social stress; reviewed in ref. 10), 
although rarely with the rigour of Singh and 
colleagues’ work. Is selection by parasites 
a driver of the evolution of plasticity in the 
recombinant fraction, perhaps one that spills 
over to other types of stress? Or is the observed 
response to pathogens a by-product of whatever 
causes plasticity in response to these other 
stresses? A crucial first step towards answering 
these questions would be to obtain evidence 
(so far lacking) that recombinant offspring 
are less likely to become infected than 
non-recombinant offspring. 

Although plasticity in the recombinant 
fraction has been known for around 100 years, 
it is still poorly studied. We have only the 
crudest picture of what conditions alter the 
recombinant fraction, by how much and in 
which genomic regions. Moreover, theoretical 
models” suggest that the evolution of recombination 
plasticity is not easily explained for 
‘normal’ stresses in diploid organisms (those 
that have two copies of each chromosome, 


COMPUTATIONAL ASTROPHYSICS 


Monstrous galaxies unmasked 


The enigma of how the most luminous galaxies arise is closer to being solved. New 
simulations show that these are long-lived massive galaxies powered by prodigious 
gas infall and the recycling of supernova-driven outflows. SEE LETTER P.496 


ROMEEL DAVE 


Three billion years after the Big Bang, 
the Universe was a different place from 
today. During that epoch, known as 
cosmic noon, the average star-formation rate 
across the cosmos was 100 times higher than 
it is at present, and individual galaxies were 
growing commensurately rapidly. This was 
illustrated by the surprising discovery’, more 
than a decade ago, of galaxies whose star- 
formation rates during that era were 
1,000 times the Milky Way’s current output — 
no such galaxies are seen in the present-day 
Universe. On page 496 of this issue, Narayanan 
et al.” present numerical simulations that offer 
unprecedented clarity in understanding the 
origins of such deep-space monsters. 

These galaxies have extreme properties 
and are the most luminous in the Universe. 
However, despite their enormous total 
energy output, they are faint at optical wave- 
lengths: most of the radiation emitted by their 
stars is absorbed by a ‘mask’ of interstellar 
dust and re-emitted at longer wavelengths. 
Consequently, they remained undiscovered 
until the advent of surveys at submillimetre 
and radio wavelengths'. The very exist- 
ence of these submillimetre galaxies (SMGs) 
presented a challenge to models of galaxy 
formation in a cosmological framework, and 
has since sparked a vigorous debate in the field 
of galaxy-formation theory. 

Two schools of thought emerged, centred 
around the ‘merger-starburst’ and the ‘smooth- 
accretion’ hypotheses, respectively”. The 
former proposes that a given SMG is the 
product of a collision between two gas-rich 
disk galaxies; this process drives a short-lived 
(about 10⁸ years) but spectacular burst of 
star formation during the galaxies’ coalescence. 
The latter argues that SMGs represent the most 
massive members of the entire galaxy population, 
being long-lived phenomena that are continuously 
fed by gas accretion over periods of 
about 10⁹ years. 

The merger-starburst hypothesis stems 
from a scaled-up analogy of the observation 
including flies). Even the seemingly intuitive 
Red Queen interpretation of Singh and col- 
leagues’ results is questionable because off- 
spring will always receive half the alleles 
carried by their mother, regardless of whether 
they are recombinant or not. Although studies 
such as this shed light on variation in recom- 
bination, there is a long way to go in terms 
of fully describing this variation and under- 
standing it from both a mechanistic and an 
evolutionary perspective. 


Aneil F. Agrawal is in the Department of 
Ecology & Evolutionary Biology, University of 
Toronto, Toronto, Ontario M5S 3B2, Canada. 
e-mail: a.agrawal@utoronto.ca 


1. Singh, N. D. et al. Science 349, 747-750 (2015). 
2. Bell, G. & Maynard Smith, J. Nature 328, 66-68 (1987). 
3. Jaenike, J. Evol. Theory 3, 191-194 (1978). 
4. Hamilton, W. D. Oikos 35, 282-290 (1980). 
5. Peters, A. D. & Lively, C. M. J. Evol. Biol. 20, 1206-1217 (2007). 
6. Agrawal, A. F. Evolution 63, 2131-2141 (2009). 
7. Kovalchuk, I. et al. Nature 423, 760-762 (2003). 
8. Lucht, J. M. et al. Nature Genet. 30, 311-314 (2002). 
9. Andronic, L. Can. J. Plant Sci. 92, 1083-1091 (2012). 
10. Agrawal, A. F., Hadany, L. & Otto, S. P. Genetics 171, 803-812 (2005). 


that the most luminous galaxies in the present- 
day Universe are almost always involved in 
spectacular collisions’. The smooth-accretion 
hypothesis is founded on the prediction’ that, 
at early cosmic epochs, galaxies were accret- 
ing gas at extremely high rates — they could 
thus potentially sustain their excessive star- 
formation activity. 

Neither scenario has been successful in fully 
replicating the observed properties of SMGs. 
Researchers have been unable to reproduce 
the number of systems required to match 
the observations under the merger-starburst 
hypothesis®, because collisions between suf- 
ficiently large galaxies were rare at those early 
times. An influential study* has argued that 
collisions between more-numerous low-mass 
galaxies could lead to the formation of SMGs, 
but only under the assumption that the star- 
formation process stimulated by the mergers 
was heavily weighted towards the production 
of massive stars. However, this assumption was 
subsequently disfavoured by observational 
results’. 

Similarly, the smooth-accretion scenario has 
been tested in cosmological simulations that 
generated SMGs matching those observed, but 
they did not reproduce the high luminosities of 
these systems”®. Given that SMGs are thought 
to be the progenitors of the well-studied 
elliptical galaxies found in the present-day 
Universe, the inability to fit these sources 
straightforwardly into a cosmological galaxy- 
formation context has been worrying. 

In the current study, Narayanan et al. present 
a hydrodynamic simulation in a cosmo- 
logical framework that yields the first SMG 
with a luminosity that is a good match to the 
Figure 1 | Simulation of a submillimetre galaxy (SMG). Narayanan et al.” simulate how the Universe's 
most luminous galaxies, which look extremely bright in the submillimetre part of the spectrum, may have 
formed when the Universe was 3 billion years old. This snapshot, taken from a supercomputer simulation, 
depicts the distribution of gas and light in a small region of the field: it contains a bright central galaxy 
(white) that is accreting gas along a filamentary structure (pink), a large spiral galaxy (left of centre), and 
numerous smaller galaxies that contribute to the total luminosity of the SMG. Ambient gas (blue-green), 
much of which was expelled by the galaxies at earlier epochs, gravitates towards the centre of the proto-SMG. 
This fuels the prodigious star-formation activity of the system, which is unlike anything seen in the 
present-day Universe. 


observations, and that can, by extrapolation, 
reproduce the numbers of SMGs that are 
observed. This simulation achieves levels of 
realism that previous models lacked (Fig. 1), by 
using a ‘zoom’ technique, in which the authors 
resimulate a selected region at much higher 
spatial resolution than the whole simulated 
volume. They thus obtain an accurate repre- 
sentation of the galaxy-assembly process on 
subgalactic scales, while producing models that 
retain the full cosmological context. Further- 
more, the high resolution enables the authors 
to develop a fully self-consistent description 
of galactic outflows driven by the supernova 
explosions of massive stars that eject copious 
quantities of gas into intergalactic space — a key 
aspect in regulating early galaxy growth. The 
authors use a newly developed code for radia- 
tive transfer to accurately predict the energy 
output of the SMG at submillimetre wave- 
lengths. These crucial improvements in mod- 
elling lead to a deeper understanding of SMGs. 

So what do the authors find? The key out- 
come of their simulation is a ‘long-lived’ 
SMG that can sustain star-formation rates 
of 500-1,000 solar masses per year for about 
10⁹ years, and which has a submillimetre luminosity 
that matches typical observations. Such a 
demonstrably rapid growth in stellar mass 
results in SMGs that are among the most mas- 
sive objects in the Universe at cosmic noon. This 
realization has two corollaries: first, the strong 
gravitational attraction of the sources causes 
numerous other galaxies to cluster around 
them, thus adding non-trivially to the system's 
total submillimetre luminosity. Second, galactic 
outflows cannot easily escape the intense 
gravitational pull of an SMG, but instead rain 
back down on the galaxy. This process provides 
additional fuel that enhances the star-forma- 
tion rate. In a nutshell, the authors find that 
SMGs plausibly arise from a ‘perfect storm’ 
of high rates of gravitationally driven gas 
accretion, the recycling of previously ejected 
material, and contributions to the systems’ 
submillimetre luminosity from nearby galax- 
ies that cannot be well resolved observationally. 
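
A back-of-the-envelope sketch shows why such a galaxy ends up among the most massive objects of its era. It multiplies the quoted star-formation rates of 500-1,000 solar masses per year by a duration taken here to be about a billion (10⁹) years. The function name is illustrative, and the calculation deliberately ignores the gas that dying stars return to their surroundings, so it gives an upper estimate rather than a result from the simulation.

```python
def stellar_mass_formed(rate_solar_masses_per_year, duration_years):
    """Total mass turned into stars at a constant rate, in solar masses.

    Simplifying assumption: no mass is returned to the gas by stellar
    evolution, so this overestimates the mass left in stars.
    """
    if rate_solar_masses_per_year < 0 or duration_years < 0:
        raise ValueError("rate and duration must be non-negative")
    return rate_solar_masses_per_year * duration_years


if __name__ == "__main__":
    duration_years = 1e9  # assumed: about a billion years
    low = stellar_mass_formed(500, duration_years)
    high = stellar_mass_formed(1000, duration_years)
    print(f"{low:.1e} to {high:.1e} solar masses")
```

Under these assumptions the galaxy forms roughly 5 × 10¹¹ to 10¹² solar masses of stars.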

Narayanan and colleagues’ results favour 
the smooth-accretion scenario over the 
merger-starburst hypothesis for the formation 
of SMGs, and provide key insights into 
this decade-old debate. This does not mean 
that galaxy mergers cannot create SMGs; they 
probably do. But the current work suggests that 
they are a minority of cases. What is particularly 
encouraging is that the authors did not 
tune the simulations so as to reproduce SMGs: 
rather, they simply used a state-of-the-art 
galaxy-formation model and ran it at the 
highest currently feasible numerical resolution, 
and a plausible SMG emerged. 

Although this study is important, it is 
unlikely to be the final word on the formation 
of SMGs. Predicting the cosmic abundance of 
such galaxies from the simulation of a single 
object still requires uncertain extrapolations, 
and the authors’ prescription for generating 
galactic outflows is not unique; other prescriptions 
may yield different results. These limitations 
notwithstanding, Narayanan et al. have 
presented the first impressively viable model 
of SMG formation, allowing us a tantalizing 
glimpse behind the mask of these behemoths 
of deep space. 


EPIGENETICS 


The karma of oil palms 


Despite their clonal origin, some oil palm trees develop fruits that give almost no 
oil. It emerges that the number of methyl groups attached to a DNA region called 
Karma determines which plants are defective. SEE LETTER P.533 


JERZY PASZKOWSKI 


Vegetative propagation is a form of 
asexual reproduction that is routinely 
used for the commercial mass pro- 
duction of garden plants and trees, because it 
enables rapid multiplication of highly per- 
forming, genetically identical individuals. 
For certain species, vegetative propagation is 
hugely demanding and requires technically 
sophisticated aseptic cultures that produce 
large numbers of cloned embryos which can 
develop into plantlets. But a proportion of 
plants propagated in this way display develop- 
mental abnormalities caused either by genetic 
aberrations or by epigenetic changes, which 
stably alter the expression of genes without 
affecting the underlying DNA sequence’. On 
page 533 of this issue, Ong-Abdullah et al.’ 
describe a culture-induced epigenetic defect in 
oil palms caused by specific losses in the num- 
ber of methyl groups attached to a particular 
region of their DNA. 

High-yielding varieties of oil palm, which 
are grown in East Asia, are propagated through 
tissue-culture techniques that regenerate 
plants from specific parts of the leaf. These 
clonal, genetically identical trees are sup- 
plied to plantations. Some, however, known 
as ‘mantled’ palms, develop abnormal flowers 
and yield much less oil. But young palms need 
several years of intensive care before they start 
to fruit, and it is only then that mantling can 
be detected. 

Because of the widespread use of palm oil for 
a variety of household, food and cosmetic prod- 
ucts, the mantled defect is a serious economic 
problem. Therefore, the way in which this trait 
is inherited has been well studied. Mantled 
palms do not follow the Mendelian rules of 
inheritance, suggesting that the defect is a result 
of epigenetic changes to gene expression rather 
than a straightforward gene mutation*. Unfor- 
tunately, epigenetic changes are more difficult 
to pinpoint than genetic lesions. Nonetheless, 
some promising clues to the cause of mantling 
have been gathered. 

The flowers of mantled trees resemble those 
found in a mutated form of the model plant 
Arabidopsis. The gene that is defective in the 
mutant Arabidopsis encodes a factor essential 
for the formation of flower organs, and the 
corresponding gene in the oil palm has 
been identified as EgDEF1 (ref. 4). In mantled 
flowers, the expression of EgDEF1 is reduced’. 
Such alterations in gene expression can be 
caused by DNA methylation, an epigenetic 
modification in which methyl groups become 
attached to DNA. Because the vegetative 
propagation of palms by tissue culture causes 
a general reduction in levels of DNA methylation’,
the suppression of EgDEF1 expression
is likely to be indirectly regulated by loss of 
DNA methylation. But the question of how 
this occurs remains. 

Genes are surrounded by, and sometimes 
even peppered with, elements derived from 
ancient viruses that have invaded genomes 
over evolutionary time. Most of these elements 
became inactive and, accordingly, their DNA is 
heavily methylated. But the geneticist Barbara 
McClintock, who discovered these elements in 
maize (corn) more than 60 years ago®, found 
that they are sometimes expressed, and can 
even move to new chromosomal locations, 
where they can interfere with gene expres- 
sion. McClintock believed that these trans- 
posable elements perform key regulatory 
functions within the host genome’. EgDEF1 
contains two transposable elements that have 
been studied previously, but the levels of DNA 
methylation and the activity of these elements 
are not linked to mantling’. So the molecular 
mechanism causing mantling has remained a 
mystery — until now. 

Ong-Abdullah et al. performed an unbiased 
genome-wide search for alterations in DNA 
methylation that were tightly linked to the 
mantled trait. Key to their experimental
design was the analysis of four groups of
palms that differed in their genetic make-up;
this reduced the number of false positives
and increased the accuracy of the authors’
approach. These genome-wide analyses yet
again pointed to the EgDEF1 gene, but this
time methylation changes associated with
mantling were detected in a fragment of the
gene that had previously been overlooked.
Ong-Abdullah and colleagues discovered
that this fragment, which lies in a long
non-protein-coding region, contains a third
transposable element, called Karma.

Figure 1 | A mechanism for mantling. Propagation of oil palms can produce defective ‘mantled’ plants.
Ong-Abdullah et al.’ report that mantling is mediated by the number of methyl groups attached to the
DNA of a transposable element called Karma within the gene EgDEF1. a, In healthy palms, Karma is
heavily methylated (Good Karma). The element is inactive, and so full-length transcripts are produced:
these include every protein-coding sequence (coloured regions) of EgDEF1 and omit every non-coding
sequence (black connecting lines), including Good Karma. b, A reduction in methylation leads to
Bad Karma. This causes alternative splicing of EgDEF1 RNA to produce an extra transcript that ends at
Karma, and reduces the number of full-length transcripts produced. The aberrant transcript might be
translated to give a truncated protein, which may be responsible for mantling.

The researchers demonstrated that Karma’s 
DNA remains methylated in healthy plants 
(a situation they dubbed Good Karma) but 
that methylation is reduced in mantled palms 
(Bad Karma). Remarkably, Karma encodes a 
‘splice acceptor’ site — a sequence that directs 
splicing in the gene’s RNA transcript. The 
authors report that the Karma splice site is 
used only when its methylation is reduced. 
Although the mechanisms underlying this 
specificity are not known, Ong-Abdullah 
et al. did show that mantled flowers produce 
an alternatively spliced EgDEF1 transcript
that accumulates during flower development, 
and that may encode a truncated EgDEF1 
protein (Fig. 1). 

Taken together, these data show that the 
loss of DNA methylation at Karma, and 
the subsequent Karma-mediated alternative 
splicing, are linked to the mantled trait. It 
is still not clear whether mantling is triggered 
by the production of truncated protein, by 
accumulation of an aberrant transcript spliced 
at Karma, by a reduction in EgDEF1 transcript
levels or by a combination of all these factors. 
But whatever the exact role of Karma, we now 
know that it can be Good Karma, methylated 
and harmless, or Bad Karma, demethylated
and associated with mantling.

Ong-Abdullah and colleagues’ finding is
likely to provide a way to detect economically 
worthless palms much earlier than was previ- 
ously possible, enabling their timely replace- 
ment in plantations. This would be not only 
of obvious economic importance, but also of 
relevance to the environment. Oil-palm plan- 
tations take over the precious space of tropical 
forests and any increase in their productivity 
will contribute to the sustainability of palm-oil 
production. 

The results also have other key implications. 
For instance, they show that well-planned and 
performed genome-wide methylation map- 
ping can pinpoint precise spots in the genome 
of a non-model organism that are responsible
for a trait of interest. This paves the way for 
similar studies that could shed light on the 
issue of ‘missing’ heritability*. Moreover, this 
approach might lead to more examples of key 
regulatory roles for transposable elements, 
lending support to the predictions made by 
McClintock decades ago’.

Hallmarks of pluripotency

Stem cells self-renew and generate specialized progeny through differentiation, but vary in the range of cells and tissues
they generate, a property called developmental potency. Pluripotent stem cells produce all cells of an organism, while 
multipotent or unipotent stem cells regenerate only specific lineages or tissues. Defining stem-cell potency relies upon 
functional assays and diagnostic transcriptional, epigenetic and metabolic states. Here we describe functional and 
molecular hallmarks of pluripotent stem cells, propose a checklist for their evaluation, and illustrate how forensic
genomics can validate their provenance.


Stem cells, defined by dual hallmark features of self-renewal and
differentiation potential, can be derived from embryonic and
postnatal animal tissues and are classified according to their devel- 
opmental potency (Fig. 1). The zygote and blastomeres are totipotent', 
denoting potential to give rise to all embryonic and extra-embryonic tis- 
sues, but their developmental potential has not been captured in vitro. 
Mouse embryonic stem cells exemplify a quintessential pluripotent stem 
(PS) cell that can form all tissues of the body, but provides only limited 
contributions to the extra-embryonic membranes or placenta. As described 
in greater detail below, PS cells manifest distinct functional properties 
depending upon the conditions under which they are derived and cultured. 
Multipotent stem cells, such as the paradigmatic haematopoietic stem cell, 
are restricted to generating the mature cell types of their tissue of origin, but 
under normal physiologic circumstances will not differentiate into unre- 
lated lineages. Unipotent stem cells, such as spermatogonial stem cells 
(SSCs), share the capacity for self-renewal yet exhibit limited devel- 
opmental potential, giving rise to only a single cell type, such as sperm. 

Human PS cells correspond to a stable state allowing propagation of 
immortal pluripotent cells that can generate any cell within the body. 
Nuclear reprogramming, via somatic cell nuclear transfer and transcrip- 
tion factor transduction, demonstrates that the specialized state of a 
somatic cell can be reversed to a totipotent or pluripotent state, respect- 
ively*’. The generation of induced pluripotent stem (iPS) cells from 
somatic cells via transcription factor expression constitutes a facile route 
to generate patient-specific PS cells, and has opened new paths to model 
diseases and new prospects for regenerative medicine. Given their ver- 
satility for medical applications, PS cells command considerable atten- 
tion; therefore, defining the hallmarks of pluripotency has practical as 
well as fundamental value to biomedical research. 

In this technical review, we describe the hallmark characteristics of PS 
cells, propose a checklist of assays for assessing the function and molecu- 
lar state of pluripotency, and outline forensic genomic approaches to 
validate the provenance of reprogrammed cell lines. 


Defining pluripotent stem cells 


PS cells are self-renewing cells with the capacity to form representative 
tissues of all three germ layers of the developing embryo: ectoderm,
mesoderm and endoderm, as well as the germ lineage, but typically
provide little or no contribution to the trophoblast layers of placenta. 
PS cells can be derived from numerous sources (Table 1). The first PS 
cells cultured in vitro were derived from teratocarcinomas, a tumour of 
germ cell origin’. Later, derivation of PS cells from the murine blasto- 
cysts proved that pluripotent cells could be propagated as immortalized, 
non-transformed cell lines*®. PS cells have also been derived from non- 
human primate and human embryos’*, and from various stages of 
development, including the post-implantation epiblast and the germ 
line?"*. Finally, somatic cells can be reprogrammed to pluripotency by 
ectopic expression of select sets of transcription factors’. 

PS cells manifest distinct properties depending on derivation and 
maintenance conditions. PS cells established from pre-implantation 
embryos are known as ES cells, whereas those generated from slightly 
later embryonic epiblast stages are called epiblast stem cells (EpiSCs)”””. 
Their distinct culture requirements, gene expression programs and epi- 
genetic features may reflect the dynamic development of pluripotency in 
the embryo. The terms ‘naive’ and ‘primed’ were introduced to describe 
early and late phases of epiblast ontogeny and respective ES cell and 
EpiSC derivatives’*. PS cells from various sources have been classified 
accordingly (Table 1). Conventional human PS cells exhibit molecular 
attributes similar to EpiSCs and are classified as ‘primed’. Evaluation of 
naive pluripotency in humans by formation of human chimaeras is 
restricted on ethical grounds in many jurisdictions, but as conventional 
non-human primate ES cells fail to chimaerize pre-implantation 
embryos, traditional human ES cells are also probably primed by this 
criterion”®. 


Molecular hallmarks of pluripotency 
PS cells are characterized by molecular mechanisms that sustain self- 
renewal and suppress differentiation while maintaining key differenti- 
ation genes in a quiescent yet ‘poised’ state reflective of their incipient 
developmental potential. 

A select set of core transcription factors in combination governs and 
thereby defines pluripotency: OCT4 (also known as POU5F1), SOX2
and NANOG (collectively, OSN). OCT4 and NANOG are designated as 
Figure 1 | Stem-cell potency. a, Two cardinal assays for assessing PS-cell
potency are blastocyst chimaerism and teratoma formation. Performance in 
these assays allows classification of totipotent, naive pluripotent, primed 
pluripotent, and multipotent developmental potentials. Totipotency is defined 
by the capacity to develop and form all tissues of the organism, including extra- 
embryonic tissues. Naive PS cells are distinguished by the capacity to form a 
teratoma and a chimaeric animal following introduction into pre-implantation 
embryos, whereas primed PS cells form teratomas but do not efficiently form 
chimaeras following introduction into pre-implantation embryos. Tissue- 
specific multipotent stem cells form cell types related to their tissue-of-origin, 
but do not form teratomas or chimaeras. Primed EpiSCs do not efficiently 
form chimaeras when introduced into blastocysts, but can contribute
to non-viable post-implantation chimaeras. Therefore, EpiSCs also exhibit
pluripotency when introduced into post-implantation embryos. A strict 
criterion for potency is the demonstration that a single cell can differentiate into 
the different cell types via single-cell transplantation or by genetically labelling 
test cells and demonstrating that the daughters of a single cell contribute to 
different lineages. For human PS cells, teratoma formation remains the gold 
standard functional assay. Although single-cell-derived teratomas have not 
been directly generated from diploid human PS cells, clonal-cell-line-derived 
teratomas provide indirect evidence for the developmental potential of human 
PS cells at a single-cell level. b, Checklist for assessing the function and state of 
candidate PS cells. Validating the pluripotency of novel PS cells involves 
assessment of ‘function’ by measuring self-renewal capacity and developmental 
potential, and validating pluripotency as a ‘state’ by measuring the activation 
of core pluripotency transcription factors (TFs) OCT4, SOX2 and NANOG, 
and characterization of state markers, such as marker transcription factors and 


core transcription factors based on their specific expression pattern in PS 
cells and early embryos, and genetic screens identifying their essential role 
in establishing pluripotency in mice and humans*’”*°. OCT4 functions as 
a heterodimer with SOX2, placing SOX2 among the core regulators”. The 
generation of mouse and human iPS cells by ectopic expression of OCT4 
and SOX2 highlights the pre-eminent role of OCT4/SOX2 in establishing 
pluripotency. Although NANOG is not required for mouse PS-cell 
DNA methylation levels. For example, human ground state PS cells are
anticipated to exhibit global DNA hypomethylation and reactivation of 
transcription factors expressed during pre-implantation development. For 
novel claims of PS cells, when possible, forensic-genomics-based approaches 
and independent reproduction in an independent laboratory should validate 
the provenance and reproducibility of pluripotent phenomena. The blue boxes 
indicate in vivo differentiation assays that should not be assessed in human 
cells; the red box indicates the uncertain relevance of X chromosome 
reactivation as a criterion for human ground state PS cells owing to the 
unresolved interpretation of X chromosome status in human naive 
pluripotency. AP activity, alkaline phosphatase activity. c, Resetting to ground 
state pluripotency. Primed PS cells exhibit high levels of DNA methylation, 
cannot chimaerize pre-implantation blastocysts, and female primed PS cells 
exhibit post-X-chromosome-inactivation status. Xa, active X chromosome; Xi, 
inactivated X chromosome. To overcome the differentiation barrier between 
naive and primed PS cells, transcription factors (TFs) are introduced into 
primed PS cells to initiate resetting. Transcription-factor-induced PS cells or 
metastable PS cells cultivated in ground state culture conditions will be reset to 
ground state pluripotency, demarcated by homogeneous expression of naive 
transcription factors and global DNA hypomethylation (low 5-mC) 
reminiscent of pre-implantation embryo cells. Globally hypomethylated 
genomes in ground state mouse ES cells resemble pre-implantation blastocysts, 
whereas serum-cultivated mouse ES cells and primed EpiSCs possess a 
hypermethylated genome reminiscent of post-implantation epiblasts and 
somatic cells. The methylation state of altered human PS cells is undefined, but 
reset cells generated by the Smith laboratory exhibit DNA methylation level 
changes closer to ground state mouse ES cells”. 


maintenance”, and is expressed at low or absent levels in mouse EpiSCs, 
it stabilizes PS cells, is necessary for in vivo pluripotency to develop in the 
inner cell mass (ICM)”*, and extensively co-localizes with OCT4 and SOX2 
throughout the mouse and human PS cell genome. While the core tran- 
scription factors define and govern pluripotency, in special circumstances 
PS cells can tolerate loss of SOX2 or NANOG or substitution with other 
factors, suggesting flexibility in pluripotency governance. Among the core 
Table 1 | Different PS cell types and their developmental potentials

Criteria for pluripotency: in vitro differentiation, teratoma, postnatal chimaera, germline transmission, 4n complementation.

Starting cells | Pluripotent stem cell | In vitro differentiation | Teratoma | Postnatal chimaera | Germline transmission | 4n complementation | State of pluripotency | Ref.
Mouse germline tumour | ECCs | Yes | Yes | Yes | Yes | No | Naive | 4
Mouse oocyte | Parthenogenetic ES cells | Yes | Yes | Yes | Yes | No | Naive | 101
Mouse blastomere | ES cells | Yes | Yes | Yes | Yes | Yes | Naive | 102
Mouse ICM | ES cells | Yes | Yes | Yes | Yes | Yes | Naive | 5,6
Mouse epiblast | EpiSCs | Yes | Yes | No | No | | Primed | 9,10
Mouse primordial germ cell | Embryonic germ cells | Yes | Yes | Yes | Yes | ? | Naive | 11,103
Mouse SSCs | GS cells, gPS cells; MASC | GS cells, gPS cells, MASC | GS cells, gPS cells, MASC | GS cells, gPS cells | GS cells, gPS cells | ? | Naive (GS cells, gPS cells); Primed (MASCs) | 12,104
Mouse somatic cells | iPS cells | Yes | Yes | Yes | Yes | Yes | Naive | 3
Mouse somatic cells | Nuclear-transfer ES cells | Yes | Yes | Yes | Yes | Yes | Naive | 105
Human germline tumour | ECCs | Yes | Yes | No | No | No | ? | 106
Human oocyte | Parthenogenetic ES cells | Yes | Yes | N/A | N/A | N/A | Primed | 107
Human blastomere | ES cells | Yes | Yes | N/A | N/A | N/A | Primed | 108
Human ICM | ES cells | Yes | Yes | N/A | N/A | N/A | Primed | 7
Human somatic cells | iPS cells | Yes | Yes | N/A | N/A | N/A | Primed | 109
Human somatic cells | Nuclear-transfer ES cells | Yes | Yes | N/A | N/A | N/A | Primed | 95

All cells listed are able to differentiate in vitro. Mouse oocyte-derived, blastocyst-derived ES cells, primordial germ-cell-derived embryonic germ cells, embryonic carcinoma cells (ECCs), SSC-derived cells, and iPS 
cells are able to generate chimaeras and contribute to the germ line. N/A, not applicable; ?, unknown. GS cells, germline stem cells; MASC, multipotent adult spermatogonial-derived stem cells; gPS cells, germline-derived
pluripotent stem cells.


transcription factors, OCT4 has proven most indispensable and remains 
the preeminent pluripotency factor. 

Mapping of OSN targets supports a model of regulatory control 
whereby OSN sustains self-renewal while restricting differentiation. 
OSN cooperatively bind their own promoters, forming an intercon- 
nected auto-regulatory loop'”'*. OSN activate a substantial fraction of 
protein-coding, miRNA, and non-coding RNA genes in ES cells, while 
also occupying genes encoding lineage-specific regulators””’*. The pro- 
moters of many lineage regulators harbour both active (H3K4me3) and 
repressive (H3K27me3) histone marks, a bivalent state thought to facil- 
itate activation of development genes upon exit from pluripotency”. The 
capacity of OSN to activate genes necessary for maintaining ES cells, while 
repressing lineage-specifying regulators, chiefly accounts for the dual 
hallmark features of self-renewal and differentiation potential. 

While OCT4 and SOX2 are expressed in all PS cells, PS cells can be 
classified into different states of pluripotency based on a complement of 
diagnostic molecular signatures that delineate proximity to the pre- 
implantation ICM or post-implantation epiblast, respectively (Fig. 1c). 
In mice, four key distinctions amongst the various pluripotent states have 
been described to date: (1) X chromosome status in female cells; (2) global 
levels of DNA methylation; (3) Oct4 enhancer utilization; and (4) express- 
ion levels of a select group of regulators designated as ‘naive’ transcription 
factors: Klf4, Klf2, Esrrb, Tfcp2l1, Tbx3 and Gbx2 (refs 10, 26, 30-33).
These naive transcription factors, along with Nanog, are expressed at low 
levels or are absent in primed PS cells and can reset primed PS cells in 
conjunction with naive pluripotency culture conditions. The capacity of 
‘naive’ transcription factors to reset primed PS cells suggests a regulatory 
intersection between naive transcriptional circuitry and epigenetic reset- 
ting of the DNA methylome and X chromosome. 

A molecular ‘ground state’ in mouse ES cells can be enforced by cul- 
tivating cells in leukaemia inhibitory factor and small molecule inhibitors 
of Mek and Gsk3 kinases (2i/LIF conditions), which stabilizes the dia- 
gnostic signatures of pluripotency in the pre-implantation blastocyst". 
Ground state ES cells exhibit two active X chromosomes in female cells, 
low levels of DNA methylation, preferential utilization of the Oct4 distal 
enhancer, and naive transcription factor expression. In contrast, an 
alternative primed state is favoured by cultivation in FGF/ACTIVIN. 
Primed EpiSCs exhibit X-chromosome inactivation in female cells, high 
levels of DNA methylation, preferential utilization of the Oct4 proximal 
enhancer, and naive transcription factor repression. The molecular 
changes observed when ground state ES cells transition to primed
EpiSCs in vitro appear to mirror changes during maturation of pre-implantation
epiblast to post-implantation epiblast in vivo'*”®.
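
The four diagnostic signatures contrasted above can be summarised as a small lookup. The following Python sketch is illustrative only: the class, constant and function names are hypothetical, and the all-or-nothing matching rule is a simplifying assumption rather than an established scoring scheme.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MousePSProfile:
    """Four diagnostic signatures for a female mouse PS-cell line (hypothetical container)."""
    two_active_x: bool          # True: two active X chromosomes; False: X inactivation
    high_dna_methylation: bool  # global DNA methylation level
    oct4_enhancer: str          # "distal" or "proximal"
    naive_tfs_expressed: bool   # Klf4, Klf2, Esrrb, Tfcp2l1, Tbx3, Gbx2


# Reference profiles as described in the text for 2i/LIF and FGF/ACTIVIN cultures.
GROUND_STATE = MousePSProfile(True, False, "distal", True)
PRIMED = MousePSProfile(False, True, "proximal", False)


def classify(profile: MousePSProfile) -> str:
    """Label a profile; anything that matches neither reference is reported as mixed."""
    if profile == GROUND_STATE:
        return "ground state"
    if profile == PRIMED:
        return "primed"
    return "mixed"


# Hypothetical example: naive-like markers combined with high DNA methylation.
example_mixed = MousePSProfile(True, True, "distal", True)
print(classify(example_mixed))
```

A profile that matches neither reference is reported as mixed instead of being forced into one of the two classes, which leaves room for lines that combine features of both states.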

Both naive and primed PS cells exhibit heterogeneity at the level of 
state markers and single cells, which we briefly discuss below. While 
serum-cultivated mouse PS cells form chimaeras capable of germline trans- 
mission (a functional hallmark of naive pluripotency), such PS cells also 
bear high DNA methylation levels reminiscent of post-implantation epi- 
blast*”. EpiSCs also exhibit heterogeneity that can be altered via signalling 
pathway modulation. For example, region-specific EpiSCs (rsEpiSCs) pref- 
erentially engraft into posterior epiblasts and bear diagnostic markers of the 
post-implantation state, consistent with their status as primed PS cells”. 
Yet, rsEpiSCs possess higher cloning efficiency, a feature typically assoc- 
iated with naive PS cells. Thus, like serum-cultivated ES cells, rsEpiSCs 
manifest features associated with different phases of pluripotency. 
Cumulatively, these observations suggest mouse pluripotency encompasses 
a spectrum of functional and molecular states, highlighting the imprecision 
of nomenclature in the face of biological complexity. 

A caveat to the concept of ground state PS cells arises from single-cell 
studies suggesting inherent metastability in PS cells. Heterogeneous single- 
cell gene expression profiles, flow cytometry, and replating experiments 
indicate the coexistence of distinct molecular and functional states in 
serum-cultivated mouse ES cells**. Even individual cells in more homo- 
geneous ground state cultures have been reported to exhibit variable 
pluripotency transcription factor expression” and, while the origin and 
consequence of such heterogeneity are yet to be elucidated, the dynamic 
nature of pluripotency cannot be disregarded when classifying PS cell 
states. The markers distinguishing ground state from alternative PS cells 
remain relevant for evaluating novel PS cell types, especially claims of 
ground state human PS cells. 

Adding additional nuance to the definitions of pluripotency, func- 
tional and molecular states are not always correlated. Mouse PS cells 
maintain molecular features of pluripotency, including expression of the 
core transcription factors, even when DNA methylation and H3K27 
methylation are ablated***’, but cannot differentiate, and thereby lack 
functional pluripotency**. Thus, while molecular signatures can suggest 
pluripotency, only functional tests can establish the true developmental 
potential of a cell. Unlike mouse ES cells, conventional ‘primed’ human 
ES cells cannot tolerate DNMT1 deletion, emphasizing the functional 
differences between mouse and human ES cells, which we discuss in 
detail below**. The observation that naive cells tolerate depletion of 
epigenetic regulators supports the concept of naive pluripotency as a 
configuration with a reduced requirement for epigenetic repression
compared to primed PS cells and somatic cells. 


Functional assessment of pluripotency 

A range of assays can be employed to reveal the developmental potential 
of PS cells: (1) in vitro differentiation; (2) teratoma formation; (3) chi- 
maera formation; (4) germline transmission; (5) tetraploid complemen- 
tation; and (6) single-cell chimaera formation. A summary of these assays 
along with their advantages and disadvantages is provided in Box 1. 
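
As a reading aid, the six assays can be held in a list and queried for the most demanding one a cell line has passed. This Python sketch is illustrative only: the function name is hypothetical, and treating the order in which the assays are listed as a strict ranking of stringency is an assumption made here for simplicity.

```python
# Assays in the order listed in the text; using this order as a stringency
# ranking is an assumption of this sketch.
ASSAYS = [
    "in vitro differentiation",
    "teratoma formation",
    "chimaera formation",
    "germline transmission",
    "tetraploid complementation",
    "single-cell chimaera formation",
]


def most_stringent_passed(results):
    """Return the last-listed assay marked True in `results`, or None.

    `results` maps assay names to True/False; assays that were not run
    are simply omitted. Passing a later assay is not taken to imply
    passing an earlier one.
    """
    unknown = set(results) - set(ASSAYS)
    if unknown:
        raise ValueError(f"unknown assay(s): {sorted(unknown)}")
    passed = [assay for assay in ASSAYS if results.get(assay)]
    return passed[-1] if passed else None


# Hypothetical record for a line that differentiates in vitro and forms
# teratomas but does not form blastocyst chimaeras.
episc_like = {
    "in vitro differentiation": True,
    "teratoma formation": True,
    "chimaera formation": False,
}
print(most_stringent_passed(episc_like))
```

Each assay is recorded independently, so an untested assay is never inferred from the result of another.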

In vitro differentiation to derivatives of all three embryonic germ 
layers—ectoderm, mesoderm and endoderm—represents the lowest 
hurdle for establishing pluripotency. Typically, culture conditions 
that maintain pluripotency are replaced by cocktails of differentiation- 
inducing cytokines, morphogens or chemicals, and markers of specific 
target tissues are then surveyed. 

The teratoma formation assay assesses the spontaneous generation of
differentiated tissues from the three germ layers following the injection
of cells into immune-compromised mice. Histologic analysis of teratomas
is neither quantitative nor capable of assessing every possible cell
type. Incompletely reprogrammed cells can generate masses that
superficially resemble teratomas yet lack terminal three-germ-layer
differentiation, potentially leading to misinterpretation**. Moreover,
co-injection with matrices or scaffolds can elicit inflammatory or
foreign-body reactions that can be misinterpreted as evidence of tissue
differentiation, necessitating the use of lineage tracing or marker
analysis to distinguish donor cells from reactive host tissue”*. Because
teratomas are not generated from single cells, the teratoma assay assesses
developmental potency at a population-based level.

A third differentiation assay, blastocyst chimaera formation, measures
whether test cells can re-enter development when introduced into host
embryos at either of two pre-implantation stages: by aggregation with
cleavage-stage morulas or by injection into blastocysts’’. High-quality PS
cells support normal development and generate high-grade chimaeras with
extensive colonization of all embryonic tissues including the germ line,
whereas less-potent PS cells produce either low chimaerism or reduced
embryo viability.

BOX 1 | Functional assays for pluripotency

Functional grades of pluripotency and totipotency


a, Overview of functional tests to assess developmental potency of PS cells. Blue boxes indicate assays that are restricted using human cells. 

b, Functional assays for pluripotency, their grades of functional stringency, and ethical permissibility when using human cells. Analysis of in vitro 
characteristics, such as self-renewal capacity, colony morphology (CFU, colony-forming unit), and differentiation capacity in vitro, comprise a basic 
layer of pluripotency characterization. In vivo assays that measure differentiation capacity are taken as more robust indicators of potency.

Mouse PS-cell potency evaluation includes aggregate in vivo assays (that is, teratoma formation, embryo chimaeras (non-gestation), germline 
transmission, 2n/4n gestational complementation) and single-cell in vivo assays (that is, single-cell chimaeras and single-cell input gestations). 

4n tetraploid complementation and single-cell chimaera formation are taken as more stringent functional assays for pluripotency.

The teratoma assay is the gold standard functional assay for assessing human PS-cell developmental potential. Chimaerism assays of human PS cells 
in murine embryos, as well as formation of primary human embryo chimaeras (non-gestation), are permissible under international stem-cell 
research guidelines'?° after rigorous scientific and ethical review. Potency evaluation of primary human chimaeras by in vivo gestational
complementation in humans is ethically impermissible.


The assays for totipotency are: (1) gestation from a single input cell; and (2) gestational complementation experiments from a single cell that 
demonstrate contribution to all tissues of the body and high-grade placenta contribution. Note that it is not necessarily the case that if a test cell 
performs well in a more stringent test, that it will definitely pass a less stringent test. For example, it is unclear if totipotent cells form teratomas. 


472 | NATURE | VOL 525 | 24 SEPTEMBER 2015 


©2015 Macmillan Publishers Limited. All rights reserved 


A fourth assay, germline transmission, entails breeding chimaeras to 
produce all-donor PS cell-derived offspring, which thus demonstrates 
the capacity of test cells to generate functional gametes. The integration 
of donor cells into all tissues of viable late-stage embryos, postnatal or 
adult mice, followed by germline transmission, is a robust indicator of 
chromosomal integrity and of functional pluripotency. 

A fifth assay applied to mouse cells, tetraploid complementation, 
measures the capacity of test PS cells to direct development of an entire 
organism. Donor PS cells are introduced into tetraploid (4n) host blas- 
tocysts, which are generated by electrofusion of blastomeres at the two- 
cell stage. Because 4n blastocysts cannot sustain normal embryonic 
development beyond mid-gestation, while tetraploid extra-embryonic 
tissues develop normally and support donor cells**, any resulting 
embryos are derived essentially entirely from donor PS cells. 

A sixth, highly stringent assay is to inject single-donor mouse PS cells 
into a morula or blastocyst”. Genuine pluripotency is a property of a 
single cell and therefore chimaeras with widespread contribution from a 
single injected cell provide the clarity of clonal analysis. Both single-cell 
chimaerism and tetraploid complementation assays suffer from higher 
failure rates, but can be interpreted as the most definitive ways of dem- 
onstrating pluripotency. 

Finally, while primed EpiSCs generate tri-lineage differentiation in 
vitro and form teratomas, EpiSCs rarely form chimaeras upon introduc- 
tion into pre-implantation blastocysts. However, EpiSCs contribute to 
all germ layers when introduced into early post-implantation embryos 
in whole-embryo culture*””’, although pluripotency of single cells has 
not yet been demonstrated. 


Human pluripotent stem cells 


Conventional human PS cells exhibit molecular hallmarks of primed 
state pluripotency, including preferential utilization of the OCT4 prox- 
imal enhancer, pronounced levels of DNA methylation, and a propen- 
sity for X chromosome inactivation in female cell lines’!. Reports of 
human naive PS cells prompted some groups to attempt to assess 
potency by blastocyst chimaerism***’, constrained by the widespread 
acceptance that culture of human embryos for more than 14 days of 
development in vitro, or past the point of primitive streak formation 
(whichever is first), is ethically impermissible. Nevertheless, both 
primed and altered human PS cells have been introduced into mouse 
pre-implantation embryos***. Human naive PS cells engraft into the 
mouse ICM****, although contribution to cross-species chimaeras has 
been minimal* or not detectable***. By contrast, region-specific human 
PS cells engraft into the posterior epiblast of cultured murine post- 
implantation embryos, indicating limited cross-species chimaerism”’. 

More compelling evidence for cross-species blastocyst chimaerism 
has been reported following injection of primate naive iPS cells into 
mouse blastocysts, leading to clonal contribution to solid tissues”. 
Whereas primate ICM cells have thus far failed to form blastocyst chi- 
maeras, unlike mouse ICM cells'’, aggregation of primate blastomeres 
(totipotent cells) does produce chimaerism'®. Nonetheless, a recent study 
described altered primate PS cells that can incorporate into host embryos 
and develop into chimaeric fetuses with low-grade contribution to all 
three germ layers and early germ cell progenitors”. As in mice, high-grade 
contribution and germline transmission remain as more stringent tests to 
demonstrate naive pluripotency in primate ES cells. 

Given the distinct behaviour of primate PS cells in chimaera studies, 
and lingering uncertainties about interspecies chimaerism, injecting 
human cells into mouse embryos needs additional validation before 
being accepted as a routine assay for stem-cell potency. Lacking robust 
functional assays for human stem-cell potency, transcriptional and epi- 
genetic similarity of hypothetical ground state PS cells to the pluripotent 
cells in human pre-implantation embryos will remain the molecular 
standard for designation of human ground state PS cells (Fig. 1). 

Erasure and resetting of DNA methylation is a molecular hallmark in 
mammalian pre-implantation and germline development. Human pre- 
implantation embryos have hypomethylated genomes. In contrast, ICM 
outgrowths undergo genomic remethylation and established human ES 
cells maintain pronounced DNA hypermethylation, similar to mouse 
primed PS cells***’. Such epigenetic resetting appears to be controlled by 
a unique regulatory network present in pre-implantation embryos and 
the germ line. KLF4, TFCP2L1, ESRRB, TBX3 and GBX2, transcription 
factors implicated in mouse naive pluripotency, have been detected in 
human pre-implantation epiblast and are transcriptionally repressed in 
derived human ES cells, similarly to mouse EpiSCs®. However, the 
transcripts of certain murine naive transcription factors, such as 
KLF2, have not been detected in the human pre-implantation epiblast, 
revealing complexity. Additional species-specific differences also 
remain unresolved. The timing of X chromosome inactivation in human 
embryos is contentious*’” and ‘epigenetic erosion’ of the X chro- 
mosome in primed human ES cells complicates our understanding of 
X chromosome regulation’. Therefore, by current standards, we 
identify human ground state or naive PS cells according to molecular 
criteria used to delineate mouse ground state pluripotency, accepting 
that these criteria are tentative and subject to revision. 

Acknowledging such caveats, a growing number of studies have 
demonstrated the feasibility of altering human PS cells towards a ‘meta- 
stable’ naive state of pluripotency*’**”. More convincingly, PS cells 
generated by the Jaenisch and Smith laboratories express transcription 
factors implicated in the governance of mouse ground state ES cells****. 
While the X chromosome was inactive in human PS cells generated in 
the Jaenisch laboratory, we note again the uncertain significance of X 
chromosome status in human pluripotency*”°**'"™. Cells ‘reset’ in the 
Smith laboratory exhibit a meaningful reduction in DNA methylation to 
levels approaching those of human pre-implantation embryos. However, 
the unclear activation of the OCT4 distal enhancer and the lack of 
detailed characterization of transgene-independent cell lines leave 
open the question of whether the reset state is stable™. 

More experimental understanding of the transition from totipotency 
to pluripotency in the intact human or primate embryo will be needed to 
truly define the human ground state PS cell. Direct derivation of ground 
state ES cells from human embryos would be a landmark, highlighting 
the continued relevance of human ES cell research. 


Potency in native somatic cells 


As an organism progresses from the earliest embryonic stages to adult- 
hood its cells become progressively restricted in developmental potency, 
and acquire epigenetic modifications that present barriers to dediffer- 
entiation. However, germ cells, responsible for perpetuating the species, 
retain a unique chromatin state receptive to reprogramming to a naive 
pluripotent state by signalling pathway modulation alone. Cultivation of 
primordial germ cells in 2i/LIF, among other culture conditions, gen- 
erates chimaera-competent naive pluripotent cells®. 

By contrast, acquisition of naive pluripotency from somatic cells 
requires the prolonged, combinatorial action of reprogramming tran- 
scription factors and ES cell growth conditions’. An exception to this 
principle is chemical reprogramming, suggesting that culture conditions 
alone can fully reverse the differentiated state to pluripotency®. Notably, 
the final stage of chemical reprogramming is also induced by 2i/LIF. In 
contrast to mouse cells, our current capacity to generate human PS cells 
by signalling pathway modulation alone is more limited. The pluripo- 
tency of human embryonic germ cells and adult testis-derived human PS 
cells, both generated by culture of human germ cells, remains conten- 
tious, and small-molecule-based reprogramming of human somatic 
cells to pluripotency has not yet been demonstrated”””’. 

Alterations in cellular identity can accompany human disease. Chronic 
exposure to stomach acid from gastro-oesophageal reflux converts strati- 
fied squamous epithelium of the distal oesophagus to goblet-cell contain- 
ing columnar epithelia more typical of the intestine, a condition termed 
Barrett’s oesophagus, which predisposes to adenocarcinoma. Metaplasia 
and other forms of tissue ectopias, where aberrant tissues form in unusual 
locations, suggest cell identity conversion occurs in the body. Thus it is 
intriguing to consider various claims of pluripotency for cells isolated from 
perinatal or somatic tissues, such as multipotent adult progenitor cells”*”*, 
very small embryonic-like cells”, multi-lineage differentiating stress- 
enduring cells’*, and endogenous pluripotent stem cells”. When consider- 
ing novel claims of expanded potency a strict criterion is demonstration 
that a single cell can differentiate into different cell types, a standard of 
clonal analysis lacking in most studies. 


Evaluating totipotency features 


A robust, bidirectional capacity to form both embryonic lineages and 
extra-embryonic trophoblast layers of the placenta, as well as yolk sac 
derivatives, distinguishes totipotency from pluripotency. While somatic 
cells are reset to totipotency following nuclear transfer into oocytes, to 
date no lab (to our knowledge) has claimed to propagate in vitro cells 
with totipotency equivalent to zygotes or blastomeres. Below, we briefly 
review previous claims of placental differentiation capacity in PS cells 
and propose how one might evaluate claims of totipotency (Table 2). 

The most stringent demonstration of totipotency requires that a single 
cell produce a term birth under experimental conditions” *', a standard 
achieved in rodents and in non-human primates for single blastomeres 
extracted from pre-implantation embryos'*'. Later-stage blastomeres 
may contribute to all embryonic and extra-embryonic tissues, and yet fail 
to support a viable conceptus because of reduced cell numbers at the 
blastocyst stage. Thus, an alternate and less stringent test of totipotency 
is the potential of genetically marked single cells to contribute extensively 
to both embryonic and extra-embryonic lineages after introducing donor 
cells into pre-blastocyst-stage embryos. In the mouse, for example, only 
isolated two-cell blastomeres can generate an entire conceptus™, but sin- 
gle blastomeres at the eight-cell stage still manifest totipotency in aggrega- 
tion chimaeras'. Sister blastomeres of a four-cell stage human embryo can 
develop individually into blastocysts with ICM and trophectoderm cells*. 
An essential feature of these functional tests of totipotency is demonstra- 
tion of developmental capacity at the single-cell level. 

Mouse PS cells with bidirectional developmental capacity for extra- 
embryonic and somatic fates have been claimed following specific 
genetic (for example, Dnmt1 knockout**) or cell culture modifications 
(for example, ground state***°) (summarized in Table 2). ‘In vivo repro- 
grammed’ iPS cells purportedly contribute to the placenta, unlike ES 
cells or in vitro reprogrammed iPS cells**. These studies reported differ- 
entiation into trophoblast-stem-like cells and the formation of blasto- 
cyst-like structures. However, the in vivo chimaera potential of 
trophoblast-stem-like cells was not assessed. Further, single cells did 
not yield robust high-grade contribution to the placenta”. Thus, the 
definitive functional criterion for establishing totipotency, single-cell 
contribution to the trophoblast and ICM lineages, has not yet been 
demonstrated. The molecular changes associated with acquisition of 
totipotent-like developmental potential have differed across studies 
and include the expression of ‘2C-specific’ genes, morula-specific 
genes, and extra-embryonic transcription factors. Therefore, by current 
standards, accepting that the relevance of these molecular criteria is 
tentative and subject to revision, the essential criterion of totipotency 
remains functional, whereby a single cell generates both ICM and 
trophectoderm fates in a transplantation assay. Ideally, detailed 
assessment of embryonic and extra-embryonic tissues should be made 
late in gestation, so that extensive and functional contribution can be 
demonstrated. 

Conventional primed human PS cells reportedly form both trophec- 
toderm and primitive endoderm-like derivatives in vitro*’. However, 
confirmation of the identity of these derivatives has proven challen- 
ging®*. Injection of human naive PS cells into mouse embryos has not 
resulted in contribution to ICM and trophectoderm lineages. Future 
claims of mouse totipotent stem cells will require stringent functional 
and molecular validation, while in humans, molecular criteria and com- 
parison to primate species will have to suffice to establish plausibility. 


Assessing provenance and potency via genomics 


Advanced sequencing platforms have allowed researchers to generate a 
multitude of genomic and epigenomic data (for example, RNA sequen- 
cing (RNA-seq), chromatin immunoprecipitation sequencing (ChIP- 
seq) and bisulfite sequencing), enabling a more comprehensive descrip- 
tion of cellular identity. Systems-level analyses have confirmed that 
direct reprogramming of somatic cells largely re-establishes molecular 
signatures associated with ES cells*’®. These analyses also detected 
low-fidelity reprogramming, such as in intermediates and cells with 
epigenetic memory*””’. Recently, genomic analyses have proven instru- 
mental in defining ground state pluripotency. Thus, while not required 
for routine characterization of PS cells, genomic analyses play a 
critical role for benchmarking novel claims of reprogramming and PS 
cells (Box 2). 

DNA sequencing also provides genetic fingerprints that can eliminate 
cell contamination as a confounder of reported results. Because cell line 
contamination is widespread, applying such genotyping methods to 
confirm cell line provenance is appropriate’’. In the case of the STAP 
cell phenomenon, the authors reported acid-reprogrammed PS cells 
with features of totipotency. Our re-analysis of genomic data revealed 
unexpected mismatches in sex and genotype between donor somatic 
cells and converted STAP cells”. Further analysis of a STAP-derived 
cell line, Fgf4-induced stem cells, revealed a mixture that contained 
trophoblast stem cells, explaining the high-grade placenta colonization 
reported for Fgf4-induced stem cells. These findings are consistent with 
and extend the results of an extensive whole-genome sequencing ana- 
lysis of STAP-related samples for the RIKEN investigation”’, which 
found contamination of purported STAP stem-cell lines with embryonic 
stem cells of a different genetic background™. 
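
The allele-frequency reasoning summarized in Box 2 can be illustrated with a minimal Python sketch. It is not the pipeline used in the re-analysis described above: the function names, the band limits (0.02 and 0.25) and the minimum read depth are assumptions chosen for illustration, and real data would need additional filtering for sequencing error and copy-number changes. Each site is given as a pair of read counts (reference, alternative).

```python
# Sketch only: thresholds and names are illustrative assumptions,
# not parameters taken from the article.

def alt_allele_fractions(sites, min_depth=20):
    """Return alt / (ref + alt) for every (ref_reads, alt_reads) site
    whose total read depth is at least min_depth."""
    return [alt / (ref + alt) for ref, alt in sites if ref + alt >= min_depth]


def low_frequency_share(sites, low=0.02, high=0.25, min_depth=20):
    """Share of covered sites whose alternative-allele fraction lies
    strictly between low and high, i.e. away from the peaks expected
    near 0, 0.5 and 1 in a homogeneous culture."""
    fractions = alt_allele_fractions(sites, min_depth)
    if not fractions:
        raise ValueError("no sites pass the depth filter")
    in_band = sum(1 for f in fractions if low < f < high)
    return in_band / len(fractions)


# Toy data: a homogeneous culture shows only 0, 0.5 and 1.
pure = [(100, 0)] * 6 + [(50, 50)] * 3 + [(0, 100)]
# A 10% admixture adds sites near 0.05 (contaminant heterozygous)
# and 0.1 (contaminant homozygous alternative).
mixed = pure + [(95, 5), (90, 10)]

if __name__ == "__main__":
    print("pure culture :", low_frequency_share(pure))
    print("mixed culture:", low_frequency_share(mixed))
```

A share well above the background expected from sequencing error would prompt a closer look at the full allele-frequency histogram rather than serve as proof of contamination on its own.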

Table 2 | Stem cells with reported bidirectional developmental potential 

Criteria for totipotency 

| Cells | Genetic manipulation | Embryonic contribution | Placenta contribution | Yolk sac contribution | In vitro differentiation into trophoblast | Trophoblast stem-cell derivation | Single-cell injection | Molecular features | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| Dnmt1 KO ES cells | Dnmt1 KO ES cells | ? | Yes | ? | Yes | No | No | Co-expression of Oct4 and Cdx2 upon ‘differentiation’; hypomethylation of Elf5 promoter | 84 |
| ‘2C’ ES cells/Kdm1a ES cells | 2C reporter | Yes | Yes | Yes | Not tested/No | Not tested/No | No | Activation of 2C genes | 85 |
| Hex + 2i ES cells | Hex reporter | Yes | Yes | Yes | Cdx2+ trophoblast differentiation | Not tested/No | Yes | Co-expression of embryonic and extra-embryonic genes | 39 |
| In vivo reprogrammed iPS cells | OSKM cassette | Yes | Yes | Yes | Yes | TSC-like cells | No | Morula gene signature | 86 |

KO, knockout; TSC, trophoblast stem cell; ?, unknown. 


BOX 2 
Forensic genomics for potency and 
provenance of PS cells. 


Evaluating potency via genomic analyses. 

Transcriptome analysis. Computational analysis can quantify the 
extent to which an experimental protocol converts a parental cell 
towards the target cell. For example, PluriTest!!! defines a 
pluripotency-specific signature based on a compendium of 
expression data sets from pluripotent and non-pluripotent cells and 
evaluates the presence of this signature in a given sample. CellNet is a 
bioinformatics algorithm that assesses the fidelity of cell fate 
conversion using cell- and tissue-specific gene regulatory networks?! 

Epigenomic characterization. Genome-wide profiling of chromatin 
features (for example, histone modifications, transcription factor 
binding, DNase I hypersensitivity, and DNA methylation) captures the 
epigenetic landscape of PS cells and the transitions that occur during 
reprogramming or differentiation. Combining gene expression and 
epigenetic maps can provide mechanistic insights into the fidelity of 
reprogrammed PS cells. For example, some mouse iPS cells fail to 
re-establish bivalent domains at developmental loci, which might reduce 
developmental competence for all tissue types®?. This failure could 
only be detected via epigenetic analysis. 

Genomic integrity. Whole-genome sequencing (WGS) allows 
comprehensive identification of single nucleotide polymorphisms 
(SNPs) and a wide range of structural variations including copy 
number variants, copy-neutral events (such as translocations and 
inversions), and viral insertions, at base-pair resolution. Comparison of 
genomic variants before and after reprogramming will locate genomic 
alterations that may occur during reprogramming and potentially 
impact cellular function. 

Genotyping for cell line provenance and contamination. 

Provenance. Comparing the genotype of reprogrammed PS cells to 
parental cells enables verification of provenance. Genome-wide SNP 
arrays characterize a known set of SNPs and are sufficient for 
matching two samples. Based on the intensities of the probe 
hybridization reaction for each SNP and the ratio of the intensities 
between the two alleles, it is possible to estimate allele-specific copy 
numbers in addition to SNP genotypes. Sequencing-based assays such as 
exome or whole-genome sequencing can provide a more comprehensive 
characterization of SNPs; genotype information can also be inferred 
from RNA-seq and possibly other functional genomics data. Analyses 
of other types of genome variation such as microsatellites can also be 
used as a form of genetic fingerprinting. Microsatellite profiling, for 
example, is recommended by the International Cell Line 
Authentication Committee (ICLAC) for cultured cell lines??*. 

Contamination. Genome-wide SNP data can also be used to examine 
genetic heterogeneity of cell line cultures and to detect contamination 
with another cell line. For a homogeneous population, we expect to see 
sharp allele frequency (alternative over reference allele frequency 
ratio) distributions with a dominant peak near 0 (homozygous 
reference) and smaller peaks at ~0.5 (heterozygous) and 1 
(homozygous alternative). When there is contamination by cells from 
different individuals or strains, we expect to observe small peaks at low 
allele frequencies (for example, at 0.05 and 0.1 if there is 10% 
contamination), corresponding to the alternative alleles from the 
second population of cells. With sequencing data, these estimates can 
be derived with greater precision, using both annotated SNPs and 
novel single-nucleotide variants. Sample contamination was detected 
in the STAP data with this analysis.°9 


By contrast, forensic genomics applied to sequencing data from two 
reports of nuclear-transfer-derived human ES cells (NT-hESCs) has 
confirmed cell line provenance”*”®. Inferred genome-wide single 
nucleotide variants (SNVs) from exome sequencing data classified samples 
generated in the Egli laboratory as genetically similar or dissimilar 
(Fig. 2). Parental donor fibroblasts and NT-hESCs possessed similar 
SNV profiles, consistent with nuclear transfer origin. Independently 
sourced in vitro-fertilization-derived ES cells and parthenogenetic ES 
cells manifest distinct genetic provenance from parental donor fibro- 
blasts and NT-hESCs, as expected. SNV genotyping also confirmed 
previously reported patterns of recombination in human parthenogen- 
etic ES cells, concordant with observations in mouse parthenogenetic 
ES cells””*. Matching genotypes between parental fibroblasts and 
reprogrammed NT-hESCs were also confirmed in RNA-seq data gen- 
erated in the Mitalipov laboratory (Supplementary Fig. 1). Collectively, 
these analyses support appropriate provenance of NT-hESCs and 
exclude a parthenogenetic origin for NT-hESCs. 
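
The kind of genotype comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not the method used to produce Fig. 2: genotypes are coded as alternative-allele counts (0 for 0/0, 1 for 0/1, 2 for 1/1, None where a site was not called), the distance is assumed to be the mean absolute difference over sites called in both samples, and the sample names are hypothetical.

```python
# Sketch only: the distance definition and sample names are
# illustrative assumptions, not taken from the article.

def genotype_distance(a, b):
    """Mean absolute difference in alternative-allele count over the
    sites that were genotyped in both samples."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("samples share no genotyped sites")
    return sum(abs(x - y) for x, y in shared) / len(shared)


def closest_sample(query, references):
    """Name of the reference sample with the smallest distance to query."""
    return min(references, key=lambda name: genotype_distance(query, references[name]))


# Toy genotypes at six SNV sites.
donor_fibroblast = [0, 1, 2, 0, 1, None]
nt_es_line = [0, 1, 2, 0, 1, 2]
unrelated_es_line = [2, 0, 0, 1, 1, 0]

if __name__ == "__main__":
    refs = {"donor_fibroblast": donor_fibroblast,
            "unrelated_es_line": unrelated_es_line}
    for name, genotypes in refs.items():
        print(name, genotype_distance(nt_es_line, genotypes))
    print("closest:", closest_sample(nt_es_line, refs))
```

In practice many thousands of sites would be compared, and a derived line would be expected to sit at or near zero distance from its donor while unrelated lines sit far away.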


Reproducibility of computational analyses 

As genomic analyses can validate the provenance and confirm molecu- 
lar signatures of novel PS cells, we advocate posting of relevant genomic 
data, metadata, and full details of computational analysis upon manu- 
script publication. Deposition of sequencing data to public repositories 
such as the Gene Expression Omnibus (GEO; 
http://www.ncbi.nlm.nih.gov/geo/) and Sequence Read Archive (SRA; 
http://www.ncbi.nlm.nih.gov/sra) is required by most peer-reviewed 
journals, but enforcement of 
sharing policies is highly variable, and complicated by the complexities 
of experimental design and data. Consequently, verification that full 
data and associated metadata have been deposited often requires expert- 
ise and time beyond what is available during peer review. Greater com- 
pliance by the stem-cell community in depositing all relevant genomic 
data and metadata as well as consistent enforcement by journals will 
promote reproducibility of results. We also recommend deposition of 
‘intermediate’ data, the key steps and results obtained in the data ana- 
lysis process. For full reproducibility of computational analysis, 
we also advocate release of the computer code, through a supplemen- 
tary website or open source code management tools. We note that 
genomic analysis and availability of data, metadata, and methods are 
especially important for novel claims of reprogramming and altered 
stem-cell states. 
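
One low-cost way to make deposited 'intermediate' data verifiable is to ship a checksum manifest with it. The Python sketch below is an illustration under stated assumptions, not a journal or repository requirement: the function names and the manifest layout (relative path, size in bytes, SHA-256 digest) are hypothetical choices.

```python
# Sketch only: manifest layout and function names are illustrative.
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """List every file under root with its relative path, size and digest."""
    root = Path(root)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        entries.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return entries


def write_manifest(root, out_file):
    """Write the manifest as JSON; keep out_file outside root so that it
    is not listed in later runs."""
    manifest = build_manifest(root)
    Path(out_file).write_text(json.dumps(manifest, indent=2) + "\n")
    return manifest
```

A reader who downloads the deposited files can rebuild the manifest and compare it with the published one to confirm that nothing is missing or altered.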


Conclusion and future prospects 


Here, we articulate a consensus definition of pluripotency predicated on 
both functional assessments of differentiation potential and diagnostic 
molecular signatures. Such an integration of functional and molecular 
hallmarks of pluripotency provides for a robust set of criteria against 
which to validate claims of pluripotency achieved by novel experimental 
strategies. Given the central role of core transcription factors in repro- 
gramming somatic cells and maintaining the pluripotent state, failure to 
observe ES-cell-like levels of these transcription factors in studies assert- 
ing functional pluripotency from novel sources should merit scepticism 
and should be accompanied by strong evidence for alternative gene 
regulatory networks and mechanisms that maintain the unique pluri- 
potent state of the mammalian genome. Another example of uncoupling 
between molecular and functional hallmarks is a report that overexpres- 
sion of cell adhesion molecules such as E-cadherin can endow primed PS 
cells with the capacity to chimaerize the pre-blastocyst, with no evidence 
of resetting to naive pluripotency”. Conversely, recent reports suggest 
that reprogramming passes through a transient state that molecularly 
resembles naive pluripotency but lacks its functional hallmarks; such a 
state might not constitute bona fide naive pluripotency'”’. While most 
labs deriving PS cells for routine use need not 
employ the comprehensive set of assays reviewed here, claims of novel 
states of potency or new means of deriving PS cells necessitate more 
comprehensive characterization and documentation. 

Documentation of PS-cell states that span the continuum between 
ground state pluripotency and primed pluripotency provokes the ques- 
tion of how to define the human ground state. Further, reports that 
human PS cells can be ‘reset’ imply the feasibility of generating PS cells 
with bona fide totipotency. Ultimately, refined molecular benchmarking 
of reprogramming and more predictable experimental capture of altered 
pluripotent states require a more sophisticated understanding of 
human pre-implantation development. 

Figure 2 | Genomic provenance of nuclear transfer human embryonic 
stem cells (NT-hESCs). a, Single nucleotide sequence variants (SNVs) 
inferred from exome sequencing data using the human reference genome 
GRCh37. The selected SNVs are classified as homozygous for reference 
allele (0/0 genotype), homozygous for alternative allele (1/1 genotype) 
or heterozygous (0/1 genotype). Samples are clustered based on the sum 
of the edit distance between each SNV. The six different genotypes in 
three groups can be discerned: group A (BJ fibroblast and BJ 
fibroblast-reprogrammed human pluripotent stem-cell lines); group B 
(1018 fibroblast and 1018 fibroblast-reprogrammed human pluripotent 
stem-cell lines); and groups C-F (human parthenogenetic embryonic stem 
cells). b, Genome-wide SNP genotyping of a representative clone of 
NT-hESCs (Egli laboratory exome sequencing data) excluding 
parthenogenetic origin. Panels show genotypes for each chromosome, 
from centromere to telomere, revealing blocks or haplotypes of markers. 
Mb, megabases. c, Genome-wide SNP genotyping of a representative clone 
of parthenogenetic (meiosis I) human embryonic stem cells (p(MI)-hES 
cells) (Egli laboratory exome sequencing data). Panels show genotypes 
for each chromosome, from centromere to telomere, revealing blocks or 
haplotypes of markers. Pericentromeric heterozygosity is consistent 
with a meiosis I parthenogenetic ES cell. 
[Panel labels: a, groups A-F; b, NT-hESCs and c, p(MI)-hESCs, 
chromosomes 1-22 and X; axis, genetic distance (Mb); colour scale from 
homozygosity to heterozygosity, 0 to >0.5.] 

For lasting scientific impact, claims of reprogramming and altered 
states of pluripotency should be broadly applicable to more than one 
experimental model and be independently replicated by multiple 
laboratories. Before publication, we encourage researchers claiming 
landmark reprogramming advances to first demonstrate replication by 
independent laboratories and to incorporate forensic genomic analyses 
to confirm appropriate cell provenance. Science is ultimately a self- 
correcting process where the scientific community plays a crucial and 
collective role. 


Received 6 May; accepted 26 August 2015. 
Epicardial FSTL1 reconstitution 
regenerates the adult mammalian heart 


Ke Wei, Vahid Serpooshan, Cecilia Hurtado, Marta Diez-Cuñado, Mingming Zhao, Sonomi Maruyama, 
Wenhong Zhu, Giovanni Fajardo, Michela Noseda, Kazuto Nakamura, Xueying Tian, Qiaozhen Liu, Andrew Wang, 
Yuka Matsuura, Paul Bushway, Wending Cai, Alex Savchenko, Morteza Mahmoudi, Michael D. Schneider, 
Maurice J. B. van den Hoff, Manish J. Butte, Phillip C. Yang, Kenneth Walsh, Bin Zhou, Daniel Bernstein, 
Mark Mercola & Pilar Ruiz-Lozano 


The elucidation of factors that activate the regeneration of the adult mammalian heart is of major scientific and therapeutic 
importance. Here we found that epicardial cells contain a potent cardiogenic activity identified as follistatin-like 1 (Fstl1). 
Epicardial Fstl1 declines following myocardial infarction and is replaced by myocardial expression. Myocardial Fstl1 does 
not promote regeneration, either basally or upon transgenic overexpression. Application of the human Fstl1 protein 
(FSTL1) via an epicardial patch stimulates cell cycle entry and division of pre-existing cardiomyocytes, improving 
cardiac function and survival in mouse and swine models of myocardial infarction. The data suggest that the loss of 
epicardial FSTL1 is a maladaptive response to injury, and that its restoration would be an effective way to reverse 
myocardial death and remodelling following myocardial infarction in humans. 


The epicardium of the heart is an external epithelial layer that con- 
tributes to myocardial growth during development by providing pro- 
genitor cells'* as well as mitogens, including FGFs, IGF2, and 
PDGFs**. Recent studies suggest that the epicardium might also pre- 
serve function of the adult myocardium following injury, possibly as a 
source of myogenic progenitors®’. To our knowledge no epicardial- 
secreted factors have been shown to support adult myocardial regen- 
eration in mammals to date. 


Epicardial signal activates cardiomyocyte division 


We co-cultured an epicardial mesothelial cell (EMC) line with Myh6~ 
mouse embryonic stem cell-derived cardiomyocytes (referred to as 
mCMsESC; Extended Data Fig. 1, Supplementary Videos 1 and 2, and 
Methods). Co-cultures consistently increased the number of 
cardiomyocytes (α-actinin+ cells, Fig. 1a-c) and the expression of 
cardiomyocyte markers (Fig. 1d). Conditioned media from EMC cultures 
recapitulated this effect (Fig. 1e-h). The number of α-actinin+ cells 
exhibiting rhythmic Ca2+ transients also increased with the addition 
of EMC media (8.6-fold) (Fig. 1i), as quantified automatically by 
kinetic imaging cytometry. Similarly, conditioned media prepared 
from adult epicardial-derived cells* increased proliferation and nearly 
doubled the incidence of aurora B kinase in the cleavage furrow 
connecting adjacent embryonic cardiomyocytes (Tnnt2+ cells; 0.19 to 
0.33%, P < 0.05, Fig. 1j-m), indicating a secreted activity in the adult 
epicardium that promotes cytokinesis of embryonic cardiomyocytes. 


Engineered epicardium improves function after injury 


We next evaluated the effect of epicardial-secreted factors in the adult 
injured heart by delivering conditioned media in three-dimensional 
collagen nano-fibrillar patches’. Patches were designed with an elastic 
modulus emulating the embryonic epicardium (E ~12 kPa)’, lower 
than the mature epicardium (E > 30-40 kPa) and fibrotic cardiac 
tissue (E > 100 kPa), but higher than those of most currently 
used scaffolding biomaterials (E = 1 kPa) (Fig. 1n, o). Patches seeded 
with EMC media (33% of total volume) were sutured onto the heart 
immediately following surgically induced myocardial infarction (MI, 
permanent ligation of the left anterior descending (LAD) coronary 
artery, Fig. 1p, q). Two weeks later, patch-treated hearts (with or 
without EMC media) showed improved morphometric parameters 
(Fig. 1r-t and Extended Data Table 1), consistent with the collagen 
patch providing mechanical support that inhibits remodelling’. 
Notably, only patches containing EMC media improved cardiac 
function (Fig. 1r, s and Extended Data Table 1). 


Fstl1 is an epicardial cardiomyogenic factor 


To identify bioactive proteins, we analysed EMC-conditioned media 
by mass spectrometry. Comparison of spectra to the IPI rat database 
identified 1,596 peptide reads corresponding to 311 unique proteins. 
Ten secreted proteins with the highest spectral counts were selected 
for testing in the mCMsESC assay. Of these, cardiogenic activity was 
noted only with follistatin-like 1 (also known as Fstl1, FRP or TSC36, 
accession number: NP_077345.1) (Fig. 2a). Unlike follistatin, Fstl1 
does not block activin and its biochemical and biological functions 
are poorly characterized". Fstl1 levels increase in the bloodstream 
following acute MI and, for this reason, it has been considered a 
biomarker for acute coronary syndrome”. 

Treating mCMs’*© for 8 days with bacterially synthesized 
recombinant human FSTLI1 (10 ng ml‘) increased the number of 
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Figure 1 | Epicardial secretome has cardiogenic activity, and improves 
cardiac function after MI via embryonic epicardium-like patches. a—d, Co- 
culture of mCMs"*© cardiomyocytes with epicardial EMC cells. 

a, b, Representative micrographs. c, d, Quantification of myocyte number 

(c) and cardiac gene expression (d). *P < 0.05 compared to acellular (EMC) 
control; +P < 0.05 compared to 1 x 10° cells condition. e-i, Culture of 
mCMs**“ cardiomyocytes with EMC-conditioned media. Representative 
micrographs (e, f). Quantification of myocyte number (g), cardiac gene 
expression (h), and cardiomyocytes with rhythmic calcium transients 

(i). *P < 0.05 compared to control. j-m, Effect of adult epicardial media on 
embryonic cardiomyocytes from E12.5 GFP* cells (Tnnt2-cre;Rosa26""™"*) 
(j). Conditioned media obtained from adult epicardial-derived cells (EPDCs) 
promotes cardiomyocyte proliferation that can be heat-inactivated (k) and 


cardiomyocytes (Fig. 2b-d), the transcription levels of myocardial- 
specific proteins (Myh6, Mlc2v, and Mlc2a, Fig. 2e), and the number 
of a-actinin® cells with rhythmic Ca** transients (Fig. 2f). FSTLI 
treatment did not induce hypertrophy. Indeed FSTL1 decreased myo- 
cyte cell size in a dose-dependent manner (Fig. 2g) in 48h. Thus, 
FSTL1 recapitulates the cardiomyogenic activity of the epicardial 
conditioned medium. 


Dynamic expression of Fstll after ischaemic injury 

During fetal development, endogenous Fstl1 is expressed throughout 
the myocardium of the primitive heart tube’*, but becomes restricted 
to the epicardium by mid-gestation (Fig. 2h). Epicardial expression 
persists throughout adulthood (Fig. 2i-k). Remarkably, Fstl1 
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cytokinesis analysed by double immunostaining for aurora B and Tnnt2 
(cardiomyocytes) (1, m). *P < 0.05. n, Schematic of collagen patch generation 
(reconstructed from ref. 26). o, Evaluation of mechanical properties of 
engineered patch, measured by atomic force microscopy. p, q, Suture 
procedure of patch over ischaemic myocardium. r, Echocardiography analysis 
normalized to individual pre-surgery baseline values. s, Absolute values of 
fractional shortening (FS%). t, Masson’s trichrome staining of the animal 
cohorts: sham (control, n = 10), infarcted mice without treatment (MI only, 
n = 8), MI treated with patch only (MI plus patch, n = 8), and infarcted 
animals treated with patch laden with epicardial conditioned media (MI plus 
patch plus CM, n = 8), 2 weeks after MI. *P < 0.05 compared to Sham control, 
{P< 0.05 compared to MI-only, and +P < 0.05 compared to MI plus patch 
(see Methods for details.) 


localization shifts strikingly following ischaemic injury, such that it 
becomes abundant in the myocardium (Fig. 2i-l) and absent in the 
epicardium and infarcted area (Fig. 2i, l).


Epicardial FSTL1 promotes regeneration


Prior studies showed that transient overexpression of Fstl1, either by 
myocardial transgenic expression (Fstl1-TG"*, Extended Data Fig. 2a, 
b) or systemic infusion, is anti-apoptotic following acute ischaemia- 
reperfusion (I/R)'*". In the context of permanent myocardial infarc- 
tion, myocardial Fstl1-TG mice did not recapitulate the effect of the 
patch containing EMC-media (Extended Data Fig. 2a-j). We assessed 
epicardial-hFSTL1 delivery by collagen patches loaded with 10 µg of
recombinant bacterial-synthetized hFSTL1 per patch. Patches 

Figure 2 | Fstl1 is an epicardial cardiogenic factor with dynamic expression
after ischaemic injury. a, MS/MS spectrum of Fstl1. b-g, Fstl1 treatment of
mCMs^ESC cardiomyocytes measured by immunostaining (α-actinin, green)
(b, c), quantification of myocyte number (d), expression of cardiac-specific
markers (e), cardiomyocytes with rhythmic calcium transient (f), and
individual cardiomyocyte cell size (g). *P < 0.05 indicates statistically
significantly different from control. h, Fstl1 immunostaining in the mouse
embryonic heart (days E12.5, E15.5 and E17.5). Fstl1 (red), Wt1 (epicardial
marker), α-actinin (myocardial marker), DAPI (nuclei). Fstl1 is expressed in
epicardium (white arrowheads), not myocardium (yellow arrowhead).
i, Expression shift of Fstl1 in the mouse heart after MI. Trichrome staining
(upper panels) labels fibrosis (blue); Fstl1 immunohistochemistry (lower panels,
brown). In injured hearts Fstl1 expression is depleted from the epicardium
(brown) and upregulated in the myocardium. j-l, High resolution images of
Fstl1 expression-shift after MI (see Methods for details).


retained immune-detectable hFSTL1 up to 21 days in vitro, and 
28 days in vivo, the longest lengths of times tested (Extended Data 
Fig. 3a-f). Application of hFSTL1-loaded patches simultaneously 
with MI significantly improved survival (Fig. 3a) and sustained 
long-term recovery of cardiac function (Fig. 3b and Extended Data 
Table 2). Epicardial patch with FSTL1 also improved cardiac function 
when applied onto infarcted hearts of Fstl1-TG mice (Fig. 3c); thus,
myocardial overexpression of FSTL1 is insufficient for long-term 
recovery but epicardial reconstitution of recombinant FSTL1 is neces- 
sary to induce the beneficial effects. 

The improved cardiac function and survival following patch with 
FSTL1 treatment was accompanied by attenuated fibrosis (Fig. 3d, e, 
Extended Data Fig. 3g-j, Extended Data Table 2, and Supplementary 
Videos 3-5), increased vascularization of the patch and underlying 
myocardium at the border of the infarcted region (Fig. 3f-i), as 
reflected by the increased area occupied by vessels (Fig. 3f, g), 
and the increased number of vessels (of any size) per area unit
(Fig. 3h, i). Masson’s trichrome staining showed contiguous engraft- 
ment of the patch with FSTL1 onto the host myocardium and demon- 
strated migration of host cells into the patch including evidence 
of striated cells (green arrows, last two columns in Extended Data 
Fig. 3k). 

A similar recovery was found when FSTL1-loaded patches were 
grafted one week after I/R injury in the mouse, when cardiac function 
had substantially decreased (about 15% reduction in fractional short- 
ening, FS%). As is typical, cardiac function of untreated animals pro- 
gressively declined (22%, 20% and 16% fractional shortening at 1, 3 
and 5 weeks post-I/R). In contrast, the patch with FSTL1 cohort 
showed a nearly complete and stable recovery of fractional shortening 
(to 34% three weeks post-I/R) (Extended Data Fig. 4a—d and Extended 
Data Table 3), suggesting that epicardial-delivered FSTL1 is sufficient
to revert the loss of cardiac function after experimental MI. 


FSTL1 induces cardiomyocyte proliferation in vivo


Four weeks following MI, the patch with FSTL1 cohort showed evidence
of striated myocytes (α-actinin⁺ cells) within the patch
(Extended Data Fig. 5a-d). Cardiomyocytes in the border zone had
undergone cell division (Extended Data Fig. 5e-h) by several independent
criteria, including an increased number of double-positive
α-actinin⁺, phospho-histone H3 (pH3)⁺ cells (Fig. 3j-l; and
Extended Data Fig. 5e-k), increased incidence of aurora B kinase
localized to the midbody between α-actinin⁺ cells (Fig. 3m, n and
Supplementary Video 6), and increased incidence of cells that were
double-positive for pH3 and the nuclear cardiomyocyte marker PCM1
(ref. 16) (Fig. 3o, p) relative to MI and patch-only cohorts (Extended
Data Fig. 5l-r). Thus, epicardial FSTL1 delivery activates cardiomyocyte
cell cycle entry and cytokinesis in vivo reminiscent of the in vitro
results above. Proliferating cardiomyocytes were found only in the 
border zone and, to a lesser extent, the infarcted area (Extended Data 
Fig. 5s, t). Increased cardiomyocyte proliferation was also observed in 
the I/R injury model with delayed patch implantation (Extended Data 
Fig. 4e, f). Notably, FSTL1 did not diminish cardiomyocyte apoptosis, 
the extent of the infarcted area or the area at risk (hypoperfused area) 
acutely after MI; nor did it affect apoptosis or inflammation at day 4 
and day 8 post-MI (Extended Data Fig. 6). 

In contrast to patch with FSTL1 delivery, transgenic overexpression 
of Fstl1 (Fstl1-TG mice) did not show any evidence of cardiomyocyte
proliferation after MI (Extended Data Fig. 2k, l), despite increased
vascularization described previously’? (Extended Data Fig. 2m, n), 
indicating that epicardial-delivered FSTL1 might function differently 
than myocardial-expressed Fstl1. 

To distinguish whether the FSTL1-responsive cells arise from pre-existing
myocytes (Myh6⁺ cells) or de novo from a progenitor population,
we heritably labelled Myh6⁺ cardiomyocytes using a
4-OH-tamoxifen-inducible cre’” before injury (Fig. 3q). 4-OH tamoxifen
injected into Myho"™®°""®, Rosa26”"© mice’ efficiently labelled 
pre-existing cardiomyocytes with eGFP before MI (Fig. 3r). Four 
weeks after patch engraftment, eGFP⁺, pH3⁺ double-positive cells
were visible in the infarct area and border zone (Fig. 3s—v), indicating 
that the proliferating cardiomyocytes expressed Myh6 before MI. We 
treated cardiomyocytes at different stages of differentiation with 
FSTL1 in order to determine which stage(s) can respond by proliferating.
Neither adult mouse cardiomyocytes (Extended Data
Fig. 7a-f), neonatal rat cardiomyocytes (Extended Data Fig. 7g-j)
nor cardiomyogenic progenitor cells (Lin⁻, Sca1⁺, SP⁺)!* responded
(Extended Data Fig. 7k-m). Of the cells tested, only immature
cardiomyocytes (mCMs^ESC) proliferated in response to FSTL1 (Fig. 4a-f).

It remained paradoxical that neither the endogenous Fstl1 induced
by MI nor myocardially overexpressed Fstl1 could induce a regen- 
erative response (Fig. 3 and Extended Data Fig. 2). Western blot 
analysis indicated that myocardially overexpressed Fstl1 (in neonatal 
rat ventricular myocytes, NRVC) migrates substantially slower in 

Figure 3 | FSTL1 recapitulates the in vivo restorative effect of epicardial
conditioned media in the engineered epicardial patch, and promotes
cardiomyocyte proliferation. a, b, Survival (a) and kinetics of FS% (b)
analyses after MI in the indicated treatments. c, Effect of epicardial hFSTL1
patches on FS% in Fstl1-TG mice. d-i, Masson’s trichrome staining (d),
morphometric analysis by echocardiography (e), and vascularization
analysis (f-i) 4 weeks after MI. *P < 0.05 compared to sham, {P < 0.05 vs MI
only, and +P < 0.05 vs MI plus patch. j, Cross-sections covering infarct/patch
area separated by 250 µm, 1-2 mm from apex, used for cardiomyocyte
proliferation analysis (k-p), 4 weeks after MI. k, m, o, Co-staining of pH3 and
α-actinin (k), midbody-localized aurora B kinase between α-actinin⁺ cells


SDS-polyacrylamide gels than does epicardially synthesized Fstl1
(EMC), and that tunicamycin treatment eliminates the difference
(Fig. 4g), suggesting cell-type specific glycosylation. The bacterially
produced recombinant human FSTL1 (as used in the patch) showed a


faster migration consistent with less extensive glycosylation (Fig. 4h, 
i). Direct comparison of recombinant human FSTL1 produced in
bacterial versus mammalian cells (NS0-derived mouse myeloma cell
line) revealed that mammalian-expressed FSTL1, but not bacterial
FSTL1, protects mCMs^ESC from H₂O₂-induced apoptosis (Fig. 4j),
consistent with evidence of cardioprotection™ but not regeneration 

Figure 4 | FSTL1 proliferative activity on early cardiomyocytes depends on
the cells’ selective post-transcriptional FSTL1 modifications. a-f, FSTL1
promotes proliferation of mCMs^ESC, measured by EdU incorporation (a), pH3
(b), and aurora B immunostaining (c), and quantified in d-f. g-i, Western blot
analysis of Fstl1 secreted in cultured cardiomyocytes (myoFSTL1 CM) infected
with Adeno-Fstl1 and in EMC (EMC CM) in the presence of tunicamycin
(glycosylation inhibitor) (g), hFSTL1-V5 tagged expressed in AD-293 cells 
(h), and mammalian and bacterial-produced FSTL1 (i). Red arrows, 
glycosylated; black arrows, hypoglycosylated. j, Mammalian-produced FSTL1 


(Extended Data Fig. 2a-j) in Fstl1-TG mice. In contrast, bacterially
synthesized human FSTL1 promotes mCMs^ESC proliferation, whereas
human FSTL1 produced in NS0-derived cells or in NRVCs cannot
stimulate proliferation of mCMs^ESC (Fig. 4k-n). Thus, whether
FSTL1 induces cardioprotection versus proliferation correlates with
cell source and might reflect post-translational modification. 


Epicardial FSTL1 in a preclinical swine model 

Epicardial delivery of FSTL1 was evaluated in the swine model of I/R 
injury. I/R decreased left ventricular ejection fraction (EF %) from ~50% 
before MI, as determined by magnetic resonance imaging (MRI), to 
~30% at 1 week after injury. Application of the patch with FSTL1 to 
the epicardium over the injured tissue at this time (1 week post-MI I/R) 
stimulated recovery of contractile function (to ~40% EF) in 2 weeks 
(3 weeks post-MI I/R) (Fig. 5a, b). The recovery remained stable for 
an additional 2 weeks, the longest time analysed, and was in contrast 
to the steady decline seen without treatment or following treatment with 
patch alone (Fig. 5b). FSTL1-treated pigs demonstrated the least scar size 
of all treatments, including the patch-only condition (see representative 
MRI images (Fig. 5c, d)). Examination of histological sections of tissues 
4 weeks after patch implantation confirmed the limited fibrosis and
showed integration of the patch into the host tissue (Fig. 5e).
Cardiomyocytes in the border zone and ischaemic area of the patch with
FSTL1 treated hearts also had evident EdU labelling (Fig. 5i-m) and
midbody-localized aurora B kinase (indicative of cytokinesis) (Fig. 5n).
Vascular smooth muscle cells were also EdU⁺, suggestive of arteriogenesis
(Fig. 5g, h). Thus, the patch with FSTL1 appears therapeutic in the
swine MI I/R model. 


Discussion 


Heart regeneration studies in zebrafish suggested that the epicardium 
is activated by injury to produce factors and cells that sustain cardiac 


attenuates H₂O₂-induced apoptosis, while bacterial-produced FSTL1 cannot.
k, l, Bacterially-produced FSTL1 promotes mCMs^ESC EdU incorporation and
aurora B positivity whereas mammalian-produced FSTL1 does not.
m, n, Quantification of EdU incorporation in mCMs^ESC treated with
conditioned media of EMC and Fstl1-overexpressing NRVC (concentration
normalized to Fstl1 content). *P < 0.05 indicates statistically different from
control (see Methods for details). o, Working model of FSTL1 in distinct
cardiac compartments. 


function’’. Unlike lower vertebrate hearts, which are robustly regen- 
erative, the mammalian heart retains negligible regenerative potency 
in adulthood and, instead, sustains cardiomyocyte death and scarring 
following injury. Very little is known of the endogenous mechanisms 
that limit regeneration and the topic remains a subject of intense 
therapeutic interest and scientific debate”. Our data suggest a new 
view of epicardial function after injury in the mammalian heart. 
Rather than activation to support cardiac function, the loss of epicar- 
dial FSTL1 expression after injury, and the functional and anatomical 
recovery by reconstitution in an engineered biomaterial, indicate 
that ischaemic injury induces a maladaptive loss of FSTL1 in the 
epicardium. 

We sought to identify the cell population that proliferates in res- 
ponse to FSTL1. FSTL1 could not stimulate mature adult ventricular 
cardiomyocytes to synthesize DNA or divide, nor did it induce hyper- 
trophy (as can occur in response to mitogens) either at 48 h (Fig. 2g) 
or 4 weeks post-MI (Extended Data Fig. 50). In contrast, FSTL1 sti- 
mulated replication of newly emerging cardiomyocytes from mouse 
ESC cultures (Fig. 4a—f). FSTL1 did not enhance replication of either 
ESC-derived progenitors before the appearance of Myh6 (not shown) 
or a population of cardiac progenitors isolated from the adult murine 
heart (Extended Data Fig. 7k-m), suggesting that competence to 
respond to FSTL1 occurs transiently. Although at least some mono- 
nuclear adult cardiomyocytes can be induced to divide*’, the FSTL1- 
responsive cardiomyocytes in our experiments have even less mature 
sarcomeric and electrophysiological properties” (for example, auto- 
maticity, relatively high maximum diastolic potential, and slow action 
potential peak V_max) (Extended Data Fig. 1). The cells that respond to
FSTL1 might overlap cells identified in an earlier analysis of infarcted
hearts labelled with Myh6-cre, in which a minor population of cre-labelled
cells were reported to divide and give rise to new cardiomyocytes
upon ischaemic injury~*. Whether the FSTL1-responsive cells

Figure 5 | Epicardial FSTL1 delivery activates cardiac regeneration in
preclinical model of ischaemic heart injury. a—d, Time course MRI analysis of 
cardiac function in pigs. Functional analysis by measurement of ejection 
fraction (EF%) (a, b). Scar size at week 4 post-grafting (c, d). Green lines 
highlight scar perimeter. e-n, Analysis at week 4 post-grafting. Masson’s 


reflect resident Myh6* recruited upon injury for example”, or derive 
from de-differentiation™ (thus recapitulating the zebrafish model”) is 
an interesting question whose resolution will depend on improved 
methods to identify and/or isolate such cells.

Myocardial Fstl1 induced by MI cannot promote a regenerative
response, either basally or when abundantly overexpressed transgenically
in cardiomyocytes (Extended Data Fig. 2). However, transgenic
myocardial Fstl1 is cardioprotective post-MI’*. Direct
comparison of FSTL1 overexpressed in cardiomyocytes versus the 
epicardial (EMC) protein revealed tunicamycin-sensitive differences 
in SDS-PAGE mobility (Fig. 4), consistent with the possibility of 
differential glycosylation (or other post-translational modification) 
depending on the cell in which it is expressed. We infer from these 
data that native epicardial and myocardial FSTL1 have analogous 
differences in glycan structure that affect their function. It will be 
important to determine the structure of the glycans, as well as elucidate
how post-translational modifications dictate whether FSTL1 promotes
anti-apoptosis (myocardial) or cardiomyocyte proliferation
(EMC and bacterially produced). 

These studies identified FSTL1 as a regenerative factor that is 
normally present in healthy epicardium, but lost upon MI, suggesting 
a mechanism whereby injury maladaptively diminishes the regenerat- 
ive potency of the mammalian heart. Reconstitution of FSTL1 by an 
engineered epicardial biomaterial improved cardiac function in the 
mouse MI, mouse MI I/R and preclinical swine MI I/R models with 

trichrome staining (e). EdU incorporation (newly synthesized DNA) in the
vascular smooth muscle cells (f-h). White line demarcates patch and host
tissue. i-n, EdU incorporation (i-m) and aurora B kinase positivity (n) in
cardiomyocytes at week 4 post-grafting (see Methods for details).


evidence of cardiomyocyte regeneration amenable to clinical 
translation. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS


Data reporting. No statistical methods were used to predetermine sample size. 
Cell preparation. 

Progenitor cells. Sca1⁺, Myh6⁻ cardiomyocyte progenitors were obtained by the
Schneider laboratory as described '* (Extended Data Fig. 7k-m).

Epicardial mesothelial cells (EMCs) These were maintained in DMEM with 
10% FBS and antibiotics/antimycotic as described’’. EMCs are stably transduced 
with H2B-mCherry lentivirus for nuclei labelling (Figs 1 and 4). 

Mouse embryonic stem cell-derived cardiomyocytes (mCMs^ESC). A stable mouse
ESC line for drug resistance selection of cardiomyocytes (Myh6-Puro^r;Rex-Blast^r)
was generated by lentiviral transduction and blasticidin selection, similarly to our
previously reported human line”*.

mCMs^ESC. These were obtained by differentiation of Myh6-Puro^r;Rex-Blast^r
mESCs in a differentiation medium containing Iscove’s Modified Dulbecco
Medium (IMDM) supplemented with 10% FBS, 2 mM glutamine,
4.5 × 10⁻⁴ M monothioglycerol, 0.5 mM ascorbic acid, 200 µg ml⁻¹ transferrin
(Roche), 5% protein-free hybridoma medium (PFHM-II, Invitrogen) and
antibiotics/antimycotic. Cells were grown as embryoid bodies (EBs) until day 4
and then plated onto adherent cell culture plates until day 9, one day after the
onset of spontaneous beating. To purify Myh6⁺ cardiomyocytes, puromycin was
added at differentiation day 9 for 24 h. Subsequently, cells were trypsinized and
plated as monolayer cardiomyocytes. Conditioned media and FSTL1 treatments were
typically performed 24 h after monolayer plating. The length of the treatments
is indicated in each figure legend (Figs 1, 2, 4 and Extended
Data Fig. 1).

AD-293. These cells were purchased directly from Stratagene, avoiding
misidentification, and cultured in DMEM with 10% FBS and pen/strep.
They are used for their high transfection efficiency and yield of recombinant
proteins (Fig. 4h).

EMCs, Myh6-Puro^r;Rex-Blast^r mESCs and AD-293 cells are tested quarterly
for mycoplasma contamination when in use.

Embryonic cardiomyocytes. We used fluorescence activated cell sorting (FACS)
to purify cardiomyocytes from Tnnt2-cre;Rosa26^mTmG/+ (C57BL/6J and ICR
mixed background) hearts from E12.5 embryos. Hearts were dissociated by
collagenase IV digestion and GFP⁺ cells were purified by FACS. The GFP⁺ cells
were cultured and confirmed to be cardiomyocytes by their expression of
the cardiomyocyte-specific markers alpha actinin (ACTN2) and cardiac
troponin T (TNNT2). They were rhythmically beating when cultured in vitro
(Fig. 1j-m).

Neonatal rat ventricular cardiomyocytes (NRVCs). These cells were isolated
with the neonatal rat cardiomyocyte isolation kit (Cellutron) and cultured at
37 °C with 5% CO₂. In brief, ventricles were dissected from 1-2-day-old
Hsd:SD rats (Sprague Dawley), then digested five times for 15 min each with
the enzyme cocktail at 37 °C. Cells were pooled, pre-plated for 90 min on an
uncoated cell culture dish to remove fibroblasts, and plated on 1% gelatin-coated
cell culture plastic dishes in high-serum media (DME/F12 [1:1], 0.2% BSA, 3 mM
sodium-pyruvate, 0.1 mM ascorbic acid, 4 mg l⁻¹ transferrin, 2 mM L-glutamine,
and 5 mg l⁻¹ ciprofloxacin supplemented with 10% horse serum and 5% fetal
bovine serum (FBS)) at 3 X 10° cells per cm”. After 24h, media was changed to 
low-serum medium (same but with 0.25% FCS) and cells cultured until use 
(Fig. 4g, m, n and Extended Data Fig. 7g-j).

Adult mouse cardiomyocytes. These were isolated from 3-month-old
Myho™®2e"ER Rosa26"""". C57BL/6J mice as previously published”. 
Briefly, mice were anesthetized with pentobarbital sodium (100 mg per kg 
IP). The heart was removed and retrograde perfused at 37 °C with a Ca²⁺-free
solution (in mM: 120 NaCl, 14.7 KCl, 0.6 KH₂PO₄, 0.6 Na₂HPO₄,
1.2 MgSO₄·7H₂O, 4.6 NaHCO₃, 10 Na-HEPES, 30 taurine, 10 BDM, 5.5 glucose)
followed by enzymatic digestion with collagenase. Ventricles were cut
into small pieces and further digested. Stop buffer (Ca²⁺-free solution,
12.5 µM CaCl₂, 10% bovine calf serum) was added and the cell suspension
was centrifuged at 40g for 3 min. Myocytes were resuspended in stop buffer
in increasing CaCl₂ concentrations until 1 mM was achieved. Cells were then
resuspended in MEM, 5% bovine calf serum, 10mM BDM, 2mM 
L-glutamine and added to the collagen solution, pre-polymerization 
(250,000 cells per ml or per patch). Following collagen gelation and plastic
compression, cellular patches were cultured in the aforementioned (plating)
media overnight and then transferred into culture media: MEM, 1 mg ml⁻¹
bovine serum albumin, 25 µM blebbistatin, 2 mM L-glutamine, in the presence or
absence of recombinant FSTL1 (AVISCERA BIOSCIENCE, 10 ng ml⁻¹). At
day 7, fluorescent ubiquitination-based cell-cycle indicator (FUCCI, Premo 
FUCCI Cell Cycle Sensor, Life Technologies, US) assay was conducted on 
the 3D culture specimens as previously described*. In this assay, G1 and 
S/G2/M cells emit red and green fluorescence, respectively. The volume of 


Premo geminin-GFP and Premo Cdt1-RFP reagent was calculated using
the equation below:

\[ \text{Volume (ml)} = \frac{\text{number of cells} \times \text{PPC}}{1 \times 10^{8}} \]

where the number of cells is the estimated total number of cells at the time
of cell labelling (equal to the CM seeding density), PPC (particles per cell) is the
number of viral particles per cell (40 in this assay), and 1 × 10⁸ is the
number of viral particles per ml of the reagent. The volumes of reagents
calculated above were directly added to the cellular patches in complete cell
medium, mixed gently, and incubated overnight in the culture incubator
(≥16 h). Patch samples were imaged using a conventional fluorescence
microscope, using GFP and RFP filter sets (Extended Data Fig. 7a-f).
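
As an illustration of the reagent-volume equation above, the following minimal Python sketch computes the volume of each Premo reagent for a given cell number. The function name and its default arguments are hypothetical; the defaults (40 particles per cell, 1 × 10⁸ particles per ml) are the values stated in the text, and the example cell number is the seeding density of 250,000 cells per patch given earlier.

```python
def premo_reagent_volume_ml(number_of_cells: float,
                            ppc: float = 40.0,
                            particles_per_ml: float = 1e8) -> float:
    """Volume (ml) = number of cells x PPC / viral particles per ml of reagent.

    number_of_cells  -- estimated cells at the time of labelling
    ppc              -- viral particles per cell (40 in the assay described)
    particles_per_ml -- reagent titre (1e8 particles per ml in the text)
    """
    if number_of_cells < 0 or ppc < 0:
        raise ValueError("cell number and PPC must be non-negative")
    if particles_per_ml <= 0:
        raise ValueError("reagent titre must be positive")
    return number_of_cells * ppc / particles_per_ml


if __name__ == "__main__":
    # One patch seeded with 250,000 cardiomyocytes, as in the Methods.
    volume = premo_reagent_volume_ml(250_000)
    print(f"{volume:.3f} ml of each reagent per patch")
```

With these inputs the calculation is 250,000 × 40 / 10⁸, that is, 0.1 ml of each reagent per patch.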
Co-culture experiments. mCMs^ESC were co-cultured with H2B-mCherry EMCs
for 4 days and visualized by α-actinin immunofluorescent staining and H2B-mCherry
fluorescence (Fig. 1a, b), cardiomyocyte counting (Fig. 1c, n = 3), and
cardiogenic gene expression normalized to Gapdh gene expression (Fig. 1d, n = 3).
Epicardial conditioned media. Rat epicardial mesothelial cells (EMC) condi- 
tioned media EMC” cells were cultured in 10% FBS DMEM with penicillin/ 
streptromycin until confluent (~1 x 10°cm~*), then washed with PBS three 
times and media was changed to serum-free DMEM with penicillin/streptomycin
without phenol red and cultured for 2 additional days before the media was
collected as conditioned media (20 ml of media was added for conditioning and
18 ml was collected after 2 days). Collected media was filtered through a 0.22 µm
pore membrane (Millipore). Control conditioned media were prepared the same way
but without EMC cells (Fig. 1e-i, n-t).

Neonatal rat ventricular cardiomyocytes (NRVCs) conditioned media. NRVC
were infected with adenovirus expressing un-tagged mouse Fstl1 at MOI 50. At 24 h
post-infection, culture media was replaced by serum-free media (DMEM/F12 with
penicillin/streptomycin). The media was conditioned with the infected NRVC
and EMC cells for 24 h (Fig. 4m, n).

mCMs^ESC were treated with control and EMC-conditioned media for 8 days
before α-actinin immunofluorescent staining (Fig. 1e, f), cardiomyocyte counting
(Fig. 1g, n = 3), analysis of cardiogenic gene expression normalized to Gapdh
gene expression (Fig. 1h, n = 3) and quantification of the number of cardiomyocytes
with rhythmic calcium transient measured automatically using a Kinetic
Imaging Cytometer (Vala Sciences) (Fig. 1i, n = 3).

mCMs^ESC were treated with serial dilutions of conditioned media of EMC and
Fstl1-overexpressing NRVC for 24 h with 10 µg ml⁻¹ EdU, and stained for α-actinin
and EdU (Fig. 4m, n, n = 5). The concentrations of the conditioned media are
normalized to the amount of Fstl1 expression by western blot.

Adult mouse EPDC conditioned media. This was generated in the Zhou laboratory*.
Briefly, eight-week-old adult Wt1^CreERT2/+;Rosa26^mTmG/+ mice in a
C57BL/6J and ICR mixed background were given 4 mg tamoxifen orally
by gavage; four to five doses were administered during a two-week
period. Myocardial infarction was then induced by ligation of the left anterior
descending coronary artery in adult (11-week-old) mice. One week after injury, we
collected Wt1^CreERT2/+;Rosa26^mTmG/+ hearts, which were then digested with
collagenase IV into single cells. Digestion solution was made by adding 4 ml 1%
collagenase IV and 1 ml 2.5% trypsin into 44.5 ml Hanks’ balanced salt solution,
and supplemented with 0.5 ml chicken serum and 0.5 ml horse serum. Cells were
resuspended in Hanks’ balanced salt solution, 4 ml digestion solution was added
to each tube and the tubes were rocked gently in a 37 °C shaker for 6 min. After
removing the supernatant containing dissociated cells, we added another 4 ml
digestion solution to repeat the digestion 6 times. After the final digestion, we
filtered the cells through a 70 µm filter and pelleted them by centrifuging at
200g for 5 min at 4 °C. Cells were then resuspended in Hanks’ balanced salt
solution for FACS isolation. Dissociated cells from GFP⁻ hearts were used as a
control for gate setting in FACS. GFP⁺ cells (epicardium-derived cells, EPDCs)
were isolated from GFP⁺ Wt1^CreERT2/+;Rosa26^mTmG/+ hearts by FACS and these
GFP⁺ purified populations were confirmed to be GFP⁺ cells under a fluorescence
microscope (Fig. 1). FSTL1 expression (determined by PCR) was restored in
cultured GFP⁺ EPDCs. Complete conditioned media from EPDCs was then added to
embryonic cardiomyocyte culture for 48 h before assay for proliferation
(Fig. 1k-m, n = 5).
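
The digestion solution recipe above can be checked with a short calculation. The sketch below is illustrative only: it sums the listed volumes and derives the final percentage of each enzyme in the mixed solution, assuming the stock percentages are expressed on the same basis as the final ones. The variable and function names are hypothetical.

```python
# (name, volume in ml, stock concentration in % or None if not an enzyme stock)
DIGESTION_COMPONENTS = [
    ("collagenase IV", 4.0, 1.0),
    ("trypsin", 1.0, 2.5),
    ("Hanks' balanced salt solution", 44.5, None),
    ("chicken serum", 0.5, None),
    ("horse serum", 0.5, None),
]


def total_volume_ml(components):
    """Sum of all component volumes in ml."""
    return sum(volume for _, volume, _ in components)


def final_enzyme_percent(components):
    """Final % of each enzyme after dilution: stock % x volume / total volume."""
    total = total_volume_ml(components)
    return {
        name: stock * volume / total
        for name, volume, stock in components
        if stock is not None
    }


if __name__ == "__main__":
    print(f"total volume: {total_volume_ml(DIGESTION_COMPONENTS)} ml")
    for name, percent in final_enzyme_percent(DIGESTION_COMPONENTS).items():
        print(f"{name}: {percent:.3f}% final")
```

The listed volumes add up to 50.5 ml, so the working solution contains roughly 0.08% collagenase IV and 0.05% trypsin.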
MTT Assays. Proliferation of cardiomyocytes treated with conditioned medium
was measured by MTT assay using CellTiter 96 AQueous One Solution (Promega)
as previously described’. After adding the CellTiter 96 AQueous One reagent to
the cell culture medium, we incubated the plate at 37 °C for 3-4 h, and then
recorded the absorbance at 490 nm using a 96-well plate reader. Absorbance at
490 nm is tightly correlated with cell number. The MTT readout on the y-axis,
labelled MTT assay (A490), thus reflects the relative number of cells in each
well between treatment groups (Fig. 1k). Boiling of conditioned media abolished
the growth-promoting effects (Fig. 1k), suggesting a proteinaceous nature of the
effective components.
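
Because the A490 readout is used as a proxy for relative cell number, a comparison between treatment groups reduces to a ratio of mean absorbances. The sketch below shows one way this could be computed; the optional blank subtraction, the function name and the example readings are assumptions for illustration and are not described in the Methods.

```python
from statistics import fmean


def relative_cell_number(treated_a490, control_a490, blank_a490=0.0):
    """Ratio of mean blank-corrected A490 in treated wells to control wells.

    blank_a490 is an assumed medium-only background; the Methods do not
    state whether a blank was subtracted, so it defaults to zero.
    """
    treated = fmean(a - blank_a490 for a in treated_a490)
    control = fmean(a - blank_a490 for a in control_a490)
    if control <= 0:
        raise ValueError("control absorbance must exceed the blank")
    return treated / control


if __name__ == "__main__":
    # Hypothetical replicate readings, not data from the paper.
    conditioned = [0.92, 0.88, 0.95]
    control = [0.61, 0.64, 0.58]
    ratio = relative_cell_number(conditioned, control, blank_a490=0.05)
    print(f"relative cell number (treated/control): {ratio:.2f}")
```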

Recombinant FSTL1 was purchased from AVISCERA BIOSCIENCE
(00347-02-100, produced in E. coli) and R&D Systems (1694-FN-050, produced
in a mouse myeloma cell line, NS0-derived).

mCMs^ESC were treated with bacteria-synthetized recombinant human FSTL1
(10 ng ml⁻¹) for 8 days with media change every 2 days, before α-actinin
immunofluorescent staining (Fig. 2b, c), cardiomyocyte counting (Fig. 2d, n = 8),
analysis of cardiogenic gene expression normalized to Gapdh gene expression
(Fig. 2e, n = 3) and quantification of the number of cardiomyocytes with rhythmic
calcium transient measured automatically using a Kinetic Imaging
Cytometer (Vala Sciences) (Fig. 2f, n = 6). mCMs^ESC were treated with
bacteria-synthetized recombinant human FSTL1 (6.25-50 ng ml⁻¹) for 2 days before
measurement of individual cardiomyocyte cell size (in pixels) (Fig. 2g, n = 5).

mCMs^ESC were stimulated with 6.25, 12.5, 25 and 50 ng ml⁻¹ of bacteria-produced
FSTL1 for 24 h with 10 µg ml⁻¹ EdU, and stained for α-actinin and EdU
(Fig. 4a, d, n = 5). mCMs^ESC were stimulated with 10 ng ml⁻¹ FSTL1 for 48 h and
stained for α-actinin and pH3 (Fig. 4b, e, n = 5). mCMs^ESC were stimulated with
25, 100, 200 ng ml⁻¹ FSTL1 for 48 h, and stained for α-actinin (red) and aurora B
(Fig. 4c, f, n = 5).

mCMs^ESC were stimulated with 10 nM H₂O₂, and 10 ng ml⁻¹ bacteria- and
mammalian-produced FSTL1 for 24 h, and stained for α-actinin and TUNEL
for cell death (Fig. 4j, n = 5).

mCMs^ESC were stimulated with 10 ng ml⁻¹ of bacteria- and mammalian-produced
FSTL1 for 24 h with 10 µg ml⁻¹ EdU, and stained for α-actinin and EdU
(Fig. 4k, n = 5), and α-actinin and aurora B (Fig. 4l, n = 5).
FSTL1 overexpression and western blot. AD-293 cells were transiently transfected
with human FSTL1 plasmid (GE Dharmacon, ID: ccsbBroad304_02639
pLX304-Blast-V5-FSTL1) using Lipofectamine 2000 (mock transfection was
done with Lipofectamine and no plasmid). At 48 h post-transfection, serum-containing
media was replaced by serum-free DMEM and incubated with the cells for
24 h. Tunicamycin was used at 2 µg ml⁻¹. Conditioned media from tunicamycin
samples was collected during 16 h (cells looked healthy). Conditioned media was
spun at 400g for 7 min and then concentrated approximately 20 times using
Microcon 10 kDa cut-off columns (Millipore). Samples were combined at a 1:1
ratio with 2× SDS sample buffer containing protease inhibitor, DTT and 5 mM
EDTA, boiled for 10 min at 95 °C and run in a 4-15% acrylamide Mini-Protean
TGX gel, transferred to nitrocellulose membrane and incubated with anti-V5
primary antibody MAB 15253 (Pierce) at 1:1,000 dilution and anti-mouse 800 nm
conjugated secondary antibody at 1:10,000 dilution (Odyssey), and scanned using
the Odyssey CLx Imager (Fig. 4h).

Neonatal rat ventricular cardiomyocytes were infected with adenovirus expressing
un-tagged mouse Fstl1 at MOI 50. At 24 h post-infection, culture media was
replaced by serum-free media. Serum-free DMEM/F12 pen/strep media was
conditioned with the infected NRVC and EMC cells for 24 h. Tunicamycin
was used at 1 µg ml⁻¹ and media was conditioned for 16 h. Conditioned media
was spun at 400g for 7 min and then concentrated using Microcon 10 kDa cut-off
columns (Millipore). Samples were combined at a 1:1 ratio with 2× SDS sample
buffer containing protease inhibitor, DTT and 5 mM EDTA, boiled for 10 min at
95 °C and run in an Any kD Mini-Protean TGX gel, transferred to nitrocellulose
membrane and incubated with anti-FSTL1 MAB1694 (R&D) primary antibody
at 1:500 dilution and anti-rat 800 nm conjugated secondary antibody at 1:10,000
dilution (Odyssey), and scanned using the Odyssey CLx Imager. Blocking and
antibody incubation were done in Odyssey blocker. The western blot for recombinant
FSTL1 (100 ng each) was performed the same way (Fig. 4g, i).

RNA extraction and qRT-PCR. Total RNA was extracted with TRIzol 
(Invitrogen) and reverse transcribed to cDNA with QuantiTect Reverse 
Transcription Kit (Qiagen) according to the manufacturer’s instructions. 
cDNA samples synthesized from 100 ng of total RNA were subjected to RT- 
qPCR with LightCycler 480 SYBR Green I Master kit (Roche) performed with 
LightCycler 480 Real-Time PCR System (Roche) (Figs 1d, h and 2e and Extended 
Data Fig. 7b-d). Primer sequences are listed in Supplementary Table 1. 
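
The Methods state only that cardiogenic gene expression was normalized to Gapdh; the exact quantification model is not given. As a hedged illustration, the sketch below uses the widely used 2^(-ΔΔCt) approach, which assumes close to 100% amplification efficiency for both target and reference genes. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_treated: float,
                        ct_ref_treated: float,
                        ct_target_control: float,
                        ct_ref_control: float) -> float:
    """Fold change of a target gene versus control by the 2^(-ddCt) method.

    The reference gene (for example Gapdh) corrects for input differences.
    Assumes near-100% amplification efficiency; this model is an assumption,
    not something specified in the Methods.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: target gene and Gapdh, treated vs control.
    fold = relative_expression(24.0, 18.0, 25.0, 18.0)
    print(f"fold change relative to control: {fold:.2f}")
```

In this example the target crosses threshold one cycle earlier in treated cells at equal Gapdh Ct, which corresponds to a twofold increase.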

LC-MS/MS analysis of conditioned media. First, Tris(2-carboxyethyl)phosphine
(TCEP) was added to 1 ml of conditioned media to 10 mM and the
protein sample was reduced at 37 °C for 30 min. Then iodoacetamide was added
to 20 mM and the solution was alkylated at 37 °C for 40 min in the dark. Mass
spectrometry grade trypsin (Promega) was then added to the solution at a 1:100
ratio. After overnight digestion at 37 °C, the sample was desalted using a
Sep-Pak cartridge, dried using a SpeedVac and resuspended in 100 µl of 5%
formic acid. The resulting peptides were analysed on-line by an LC-MS/MS
system, which consisted of a Michrom HPLC, a 15 cm Michrom Magic C18
column, a low flow ADVANCED Michrom MS source, and a LTQ-Orbitrap
XL (Thermo Scientific, Waltham, MA). A 120-min gradient of 0-30% B (0.1%
formic acid, 100% acetonitrile) was used to separate the peptides, and the total LC
time was 141 min. The LTQ-Orbitrap XL was set to scan the precursors in the
Orbitrap at a resolution of 60,000, followed by data-dependent MS/MS of the top
4 precursors.

The raw LC-MS/MS data was then submitted to Sorcerer Enterprise 
(Sage-N Research Inc.) for protein identification against the IPI rat protein 
database, which contains semi-tryptic peptide sequences, with the allowance of up to 2 
missed cleavages and a precursor mass tolerance of 50.0 p.p.m. A molecular mass of 
57 Da was added to all cysteines to account for carboxyamidomethylation. 
The differential search included 16 Da for methionine oxidation. The search results 
were viewed, sorted, filtered, and statistically analysed using PeptideProphet 
and ProteinProphet (ISB). The minimum trans-proteomic pipeline (TPP) 
probability score for both proteins and peptides was set to 0.95 to assure 
a TPP error rate of lower than 0.01. The example MS/MS spectrum 
R.GLCVDALIELSDENADWK.L was identified as Fstl1 (Fig. 2a). Peptide 
probability = 1.0, Xcorr = 6.276, delta Cn = 0.471. 
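
The search settings above (a 50 p.p.m. precursor tolerance, a fixed 57 Da shift on cysteine and an optional 16 Da shift on methionine) can be illustrated with a short sketch. This is not the Sorcerer search engine or any part of its interface; the function names and the 2,000 Da example mass are hypothetical, and only the nominal mass shifts quoted in the text are used.

```python
# Illustrative helpers only; function names and example masses are hypothetical.

CARBAMIDOMETHYL_DA = 57.0  # nominal shift added to every cysteine (from the text)
OXIDATION_DA = 16.0        # nominal differential shift on methionine (from the text)


def ppm_window(mass_da, tol_ppm=50.0):
    """Return the (low, high) mass window for a tolerance given in p.p.m."""
    delta = mass_da * tol_ppm / 1e6
    return mass_da - delta, mass_da + delta


def modification_shift(peptide, oxidized_met=0):
    """Nominal mass added to a peptide sequence by the search settings."""
    n_cys = peptide.count("C")
    n_met = peptide.count("M")
    if oxidized_met < 0 or oxidized_met > n_met:
        raise ValueError("oxidized_met must be between 0 and the number of methionines")
    return n_cys * CARBAMIDOMETHYL_DA + oxidized_met * OXIDATION_DA


if __name__ == "__main__":
    low, high = ppm_window(2000.0)
    print(f"50 p.p.m. window around 2,000 Da: {low:.1f} to {high:.1f} Da")
    print("Shift for GLCVDALIELSDENADWK:", modification_shift("GLCVDALIELSDENADWK"), "Da")
```

The example peptide contains one cysteine and no methionine, so only the fixed cysteine shift applies to it.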

Automated in vitro cell proliferation and cell death assay. Cells (mCMsESC and 
NRVC) were incubated with EdU (details of dosage and length of exposure are 
specified in figure legends) in a 384-well plate format, and were fixed for 2 h in 4% 
PFA, washed in PBS and stained for EdU using the Click-it EdU assay kit 
(Life Technologies). The cells were then washed in PBS, immunostained with 
an α-actinin antibody (Sigma, A7811, 1:500) to identify cardiomyocytes and 
stained with DAPI (4',6-diamidino-2-phenylindole, 1:10,000) to identify nuclei. 
The plates were then imaged using the InCell 1000 system (GE Healthcare) and 
automatically analysed in Developer Toolbox (GE Healthcare) as described. 
Ratios of EdU⁺/α-actinin⁺ nuclei to α-actinin⁺ nuclei were generated to give the 
percentage of cardiomyocytes that incorporated EdU into chromosomal DNA. 
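
The ratio described above reduces to a simple per-well percentage. The sketch below is a minimal illustration with hypothetical counts; it is not the Developer Toolbox analysis, and the function name is an assumption.

```python
# Minimal sketch of the per-well ratio; the counts used below are hypothetical.

def percent_positive(double_positive, marker_positive):
    """Percentage of marker-positive nuclei that are also positive for a second label.

    double_positive: e.g. number of EdU+/alpha-actinin+ nuclei in a well
    marker_positive: e.g. number of alpha-actinin+ nuclei in the same well
    """
    if marker_positive <= 0:
        raise ValueError("no marker-positive nuclei counted")
    if not 0 <= double_positive <= marker_positive:
        raise ValueError("double-positive count must lie between 0 and the marker count")
    return 100.0 * double_positive / marker_positive


if __name__ == "__main__":
    # Hypothetical well: 25 EdU+/alpha-actinin+ nuclei among 200 alpha-actinin+ nuclei.
    print(percent_positive(25, 200))
```

The same function applies to the pH3, aurora B and TUNEL readouts, which use the α-actinin⁺ count as the denominator.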

Similarly, cells (mCMsESC and NRVC) in 384-well plate format were fixed for 
2 h in 4% PFA, washed in PBS, and immunostained with pH3 antibody 
(Millipore 06-570, 1:200) for nuclei in mitosis, or aurora B (Millipore 04-1036, 
1:200) for cytokinesis, or TUNEL (Roche) for cell death, and α-actinin antibody 
(Sigma, A7811, 1:500) for cardiomyocytes and DAPI (1:10,000) for nuclei. The 
same imaging and analysis were done for pH3 staining as for the EdU assays, and the 
aurora B⁺, α-actinin⁺ double positive cells were manually counted. The percentages 
of pH3⁺, α-actinin⁺ double positive nuclei, aurora B⁺, α-actinin⁺ double 
positive cells, and TUNEL⁺, α-actinin⁺ double positive nuclei relative to the 
total number of α-actinin⁺ cell nuclei were calculated to determine the percentages 
of cardiomyocytes undergoing mitosis, cytokinesis and apoptosis, 
respectively. 

Calcium imaging. Contractile calcium transients were recorded 
using a Kinetic Image Cytometer (KIC, Vala Sciences) using Fluo-4 NW calcium 
indicator (Life Science). Data was processed using Cyteseer software containing 
the KIC analysis package (Vala Sciences) as described. 

Compressed collagen gel for use as an engineered epicardial patch. Highly 
hydrated collagen gels, used as the cardiac patch in this study, were produced by 
adding 1.1 ml 1× DMEM (Sigma, MO, US) to 0.9 ml of sterile rat tail type I 
collagen solution in acetic acid (3.84 mg ml⁻¹, Millipore, MA, US). The resulting 
2 ml collagen-DMEM mixture was mixed well and neutralized with 0.1 M NaOH 
(~50 µl). The entire process was conducted on ice to avoid premature gelation of 
collagen. In the case of patches containing epicardial factors, the EMC culture 
media was collected as above and 0.6 ml of that was mixed with 0.5 ml DMEM. 
The collagen solution (0.9 ml) was then distributed into the wells of 24-well plates 
(15.6 mm in diameter) and placed in a tissue culture incubator for 30 min at 37 °C 
for polymerization. For pig studies, 6.8 ml of collagen was mixed with 8.2 ml 
DMEM to obtain a 15 ml solution that was then cast into a 6-cm Petri dish 
(area = 28.3 cm²). Plastic compression was performed as described 
previously. Briefly, as-cast, highly hydrated collagen gels (at ~0.9 and 15 ml 
volumes for the mouse and swine studies, respectively) underwent unconfined 
compression via application of a static compressive stress of ~1,400 Pa for 5 min (see 
refs 33, 35 for details), resulting in ~98-99% volume reduction (Fig. 1n). The 
elastic modulus of the compressed collagen, aimed to approximate that of the 
embryonic epicardium, which is optimal for contractility of immature cardiomyocytes 
(see ref. 36), was assessed by atomic force microscopy (AFM) in 
nano-indentation mode, using a force trigger that resulted in a minimal local strain 
of less than 10% (indentation of ~100 nm) to minimize the effect of 
substrate-related artefacts. A custom-made flat AFM tip was manufactured using focused 
ion beam milling and used to probe the stiffness of the gels by scanning areas of 
90 µm × 90 µm. The histogram of the distribution of measured micro-stiffness of the 
patch is compared with the range of elasticity reported for common scaffolding 
biomaterials, and the previously described optimal range of elasticity to maximize 
myocyte contractility (Fig. 1o, n = 3). 
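
The gel dimensions quoted above allow a rough back-of-the-envelope check of the patch geometry. The sketch below assumes, as a simplification not stated in the text, that the gel is a cylinder whose cross-sectional area is unchanged by compression, so that the volume reduction translates directly into a thickness reduction. The function names are hypothetical.

```python
import math

# Back-of-the-envelope geometry; assumes a cylindrical gel whose footprint
# does not change during compression (a simplifying assumption, not from the text).


def footprint_mm2(diameter_mm):
    """Cross-sectional area of a circular well or dish."""
    return math.pi * (diameter_mm / 2.0) ** 2


def gel_height_mm(volume_ml, diameter_mm):
    """As-cast gel height; 1 ml equals 1,000 mm^3."""
    return volume_ml * 1000.0 / footprint_mm2(diameter_mm)


def compressed_thickness_um(volume_ml, diameter_mm, volume_reduction):
    """Thickness after compression for a fractional volume reduction (e.g. 0.98)."""
    return gel_height_mm(volume_ml, diameter_mm) * (1.0 - volume_reduction) * 1000.0


def load_mass_g(stress_pa, diameter_mm, g=9.81):
    """Mass whose weight applies the given stress over the gel footprint."""
    area_m2 = footprint_mm2(diameter_mm) / 1e6
    return stress_pa * area_m2 / g * 1000.0


if __name__ == "__main__":
    print(f"6-cm dish area: {footprint_mm2(60.0) / 100.0:.1f} cm^2")
    print(f"As-cast height (0.9 ml, 15.6 mm well): {gel_height_mm(0.9, 15.6):.2f} mm")
    for reduction in (0.98, 0.99):
        t = compressed_thickness_um(0.9, 15.6, reduction)
        print(f"Thickness after {reduction:.0%} reduction: {t:.0f} um")
    print(f"Load for 1,400 Pa on a 15.6 mm gel: {load_mass_g(1400.0, 15.6):.1f} g")
```

Under these assumptions, the dish area reproduces the 28.3 cm² given in the text, and the compressed mouse patch comes out at several tens of micrometres thick.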

Myocardial infarction and application of the epicardial patch. Permanent LAD 
occlusion (MI). Male 10-12-week-old C57BL/6J mice were purchased from 
Jackson Laboratories (Bar Harbor, ME, USA). Fstl1-TG mice used in MI 
experiments were on a C57BL/6 background and included female and male mice aged 
12-15 weeks. Mice were anaesthetized using an isoflurane inhalational chamber, 
endotracheally intubated using a 22-gauge angiocatheter (Becton, Dickinson Inc., Sandy, 
Utah) and connected to a small animal volume-control ventilator (Harvard 
Apparatus, Holliston, MA). A left thoracotomy was performed via the fourth 
intercostal space and the lungs retracted to expose the heart. After opening the 
pericardium, a 7-0 suture was placed to occlude the left anterior descending artery 
(LAD) ~2 mm below the edge of the left atrium. Ligation was considered 
successful when the LV wall turned pale (Fig. 1p). In the case of experimental groups 
treated with patch, immediately after the ligation, the prepared collagen patch was 
sutured (at two points) onto the surface of the ischaemic myocardium (Fig. 1q). The 
patch size used was ~one-third of the 15.6 mm-diameter collagen gel. Animals 
were kept on a heating pad until they recovered. Another group of mice 
underwent sham ligation; they had a similar surgical procedure without LAD ligation. 
A minimum of n = 8 was used in each study group. 

Ischaemia reperfusion (I/R). Male C57BL/6 mice, aged 10 to 11 weeks, were 
anaesthetized and intubated as described above. A left lateral thoracotomy was then 
performed. The pericardium was gently pulled off and an 8-0 Nylon suture (Ethicon, 
Inc. Johnson & Johnson Co., USA) was used to ligate the left anterior descending 
coronary artery against a PE10 tubing, which was removed after 30 min of occlusion. 
Successful performance of coronary artery occlusion was verified by visual 
inspection (by noting the development of a pale colour in the distal myocardium 
upon ligation). The chest was then closed using 7-0 sutures around adjacent ribs, 
and the skin was closed with 6-0 suture. Buprenorphine was administered 
subcutaneously for a minimum of 1 day at BID dosing. For the animal group treated 
with patch, a second thoracotomy was performed one week after I/R 
and the prepared collagen patch was sutured (at two points) onto the surface 
of the ischaemic myocardium. Sham-operated controls consisted of age-matched 
mice that underwent identical surgical procedures (two thoracotomies) with 
the exception of LAD ligation (Extended Data Fig. 4). 

Echocardiography. In vivo heart function was evaluated by echocardiography at 
2 weeks (Figs 1r, s and 3b, c and Extended Data Fig. 2h-j), 4 weeks (Fig. 3b, c, e 
and Extended Data Fig. 2h-j), and 2 and 3 months (Fig. 3b) after LAD ligation. 
Two-dimensional (2D) analysis was performed on mice using a GE Vivid 7 
ultrasound platform (GE Health Care, Milwaukee, WI) equipped with a 13 MHz 
transducer. The mice were sedated with isoflurane (100 mg per kg, inhalation), 
and the chest was shaved. The mice were placed on a heated platform in the 
supine or left lateral decubitus position to facilitate echocardiography. 2D clips 
and M-mode images were recorded in a short axis view from the mid-left ventricle 
at the tips of the papillary muscles. LV internal diameter (LVID) and posterior 
wall thickness (LVPW) were measured at both end diastole and end systole. 
Fractional shortening (FS, %) and ejection fraction (EF, %, via extrapolation of 
2D data) were calculated from LV dimensions in the 2D short axis view. A 
minimum number (n) of 8 mice per experimental group was used for the echo 
evaluations. Measurements were performed by two independent groups in a blinded 
manner. In the ischaemia reperfusion study, in vivo heart function was evaluated 
pre-surgery (baseline), 1 week after I/R, and two and four weeks 
post-implantation (Extended Data Fig. 4a-d). 
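
Fractional shortening is conventionally derived from the end-diastolic and end-systolic LV internal diameters as FS = (LVIDd − LVIDs) / LVIDd × 100. The sketch below applies that standard definition to hypothetical diameters; it is an illustration, not the analysis software used with the ultrasound platform, and it does not attempt the ejection fraction extrapolation, whose method is not specified in the text.

```python
# Standard definition of fractional shortening; example diameters are hypothetical.

def fractional_shortening(lvid_d_mm, lvid_s_mm):
    """FS (%) from LV internal diameter at end diastole and end systole."""
    if lvid_d_mm <= 0:
        raise ValueError("end-diastolic diameter must be positive")
    if not 0 <= lvid_s_mm <= lvid_d_mm:
        raise ValueError("end-systolic diameter must lie between 0 and the diastolic value")
    return 100.0 * (lvid_d_mm - lvid_s_mm) / lvid_d_mm


if __name__ == "__main__":
    # Hypothetical mouse measurement: LVIDd 4.0 mm, LVIDs 2.8 mm.
    print(f"FS = {fractional_shortening(4.0, 2.8):.1f}%")
```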

In vivo delayed-enhanced magnetic resonance imaging (DEMRI). To prepare 
for scanning, anaesthesia was induced with 2% and maintained 
with 1.25-1.5% isoflurane with monitoring of the respiratory rate. ECG leads 
were inserted subcutaneously to monitor the heart rate while the body 
temperature was maintained at 37 °C. Using a 3T GE Signa Excite clinical scanner with a 
dedicated mouse coil (Rapid MR International, Germany), functional parameters 
were recorded on weeks 1 and 4 after treatment. The following sequences were 
performed for MRI acquisitions: (1) DEMRI was performed following i.p. 
injection of 0.2 mmol per kg gadopentetate dimeglumine (Magnevist, Berlex 
Laboratories) using gated fGRE-IR sequences with FOV 3.4 cm, slice thickness 
0.9 mm, matrix 128 × 128, TE 5 ms, TI 150-240 ms, and FA 60°; and (2) volumetric 
cardiac MRI was performed using fSPGR with FOV 7 cm, slice thickness 
0.9 mm, matrix 256 × 256, TE 5.5 ms, and FA 30°. Coronal and axial scout images 
were used to position a 2-dimensional imaging plane along the short axis of the 
left ventricular (LV) cavity (Extended Data Fig. 3h-j). 

Histology, immunohistochemistry and immunofluorescent staining. 
Histological analysis (Masson's trichrome staining) was performed according to 
standard protocols for paraffin embedded samples. For immunohistochemistry 
and immunofluorescent staining, embedded hearts were sectioned at a thickness 
of 7 µm, unless described otherwise. Antibodies used were as follows: 1:200 
α-actinin (Sigma, A7811), 1:300 α-smooth muscle actin (Sigma A2547), 1:100 
phospho-Histone3 (rabbit, Millipore 06-570), 1:300 phospho-Histone3 (mouse, 
Abcam ab14955), 1:100 WT1 (Abcam, ab15249), 1:100 (Fig. 1l-m) and 1:250 
(Figs 3m, n and 5n) aurora B (Millipore 04-1036), 1:100 Tnnt2 (DSHB, Ct3), 
1:100 Tnni3 (Abcam, ab56357), 1:200 PCM1 (Sigma-Aldrich HPA023370), 1:200 
FSTL1 (R&D MAB17381). At least 5 sections per heart were used 
for Masson's trichrome staining and 3 sections per heart per staining for 
immunohistochemistry and immunofluorescent staining. HRP 
anti-rat secondary antibody (Jackson ImmunoResearch 712-036-153, 1:500) 
was used for immunohistochemistry, and respective fluorescent secondary 
antibodies (Life Technologies 1:200) were used for immunofluorescent staining. The 
trichrome staining and immunohistochemistry images were taken using an 
upright Zeiss microscope and dissection scopes. The fluorescent images were 
taken using Apotome Optical Sectioning (Zeiss). An inclusion criterion for the 
patch engraftment was that the patch covered > 70% of the infarct (controlled 
by histology). TUNEL assay (Roche 11684795910) and EdU assay (Life 
Technologies C10337) were performed as instructed. 

Lineage tracing experiments. Epicardial lineage labelling was achieved by oral 
delivery of tamoxifen (4 mg) in eight-week old Wt1CreERT2/+;Rosa26RFP/+ mice 
with C57BL/6J and ICR mixed background (delivered 6 times over a period of 
3 weeks and stopped 1 week before MI). Hearts were collected at 2 weeks after 
MI. Immunostaining of RFP for Wt1 lineage cells, Fstl1 and Tnni3 shows that Fstl1 
is absent in epicardial cells and their derivatives, but abundant in the myocardium 
after MI (Fig. 2l). 

Cardiomyocyte lineage labelling was achieved by injecting 4-OH tamoxifen 
intraperitoneally into eight-week old Myh6mERCremER:Rosa26Z/EG mice (ref. 17) of 
C57BL/6 background at a dose of 20 mg per kg per day for 2 weeks, and stopped 
1 week before harvesting cardiomyocytes (Extended Data Fig. 7a-f), or MI 
operation and patch grafting. At 4 weeks after MI, the animals were collected for 
immunostaining (Fig. 3q-v). 

TTC staining. At day 2 post MI/patch treatment, the mouse hearts from all four 
groups were harvested and sectioned perpendicularly to the long axis into four 
sections (approximately 2 mm thick). The sections were placed in the wells of a 
12-well cell culture plate and incubated with 1% 2,3,5-triphenyltetrazolium 
chloride (TTC, Sigma-Aldrich) solution for 15 min at 37 °C. Subsequently, sections were 
washed with PBS and visualized using a stereomicroscope and photographed with 
a digital camera (Extended Data Fig. 6a, b). 

Vessel counting. Blood vessel density parameters were measured from histolo- 
gical sections of heart samples stained for von Willebrand factor (vWF) as a 
marker of endothelial cells in the vessel wall. Up to 60 sections were analysed 
for each treatment group (4 mice in each group). Analysis was performed using 
ImageJ to calculate: (1) the total luminal area of blood vessels, and (2) the number 
of vessels that stained positive for vWF. In each case, a histogram of the vessel 
parameters as a fraction of total surface area analysed was obtained and the 
mid-values plotted for each treatment group. Statistical significance (P < 0.05) 
of the differences from the sham group was determined by one-tailed ANOVA 
(Fig. 3f-i). 

Cardiomyocyte proliferation quantification in vivo. Data were collected from 5-7 
hearts in each group (7 for MI plus patch with FSTL1, 5 for sham, MI-only and 
MI plus patch). Three different cross-sections (each covering the infarct and 
patch, separated by 250 µm, between 1-2 mm from the apex) were counted 
exhaustively for total pH3⁺/α-actinin⁺, aurora B⁺/α-actinin⁺, and pH3⁺/PCM1⁺ 
cells in each section, and counts were normalized to the myocardium area quantified 
by trichrome staining of the immediately adjacent section (Fig. 5j-p, and Extended 
Data Fig. 5). 

Enzyme-linked immunosorbent assay. In order to assess the FSTL1 retention 
within the engineered patch system in vitro, collagen scaffolds laden with FSTL1 
(5 µg ml⁻¹) were immersed in PBS and shaken for various times (0, 12 h, 1 day, 
and 21 days) at 4 °C and the FSTL1 concentration was determined using an 
enzyme-linked immunosorbent assay kit (USCN Life Science, Inc., Houston, USA). The 
detection limit for this technique was 0.50 ng ml⁻¹. Scaffolds were pretreated with 
1 mg ml⁻¹ collagenase type I (Sigma Aldrich, MO, US) and 5 mg ml⁻¹ 
hyaluronidase (Sigma Aldrich, MO, US) dissolved in phosphate buffered saline for 
5 min followed by centrifugation at 5,000g for 20 min. 

Aliquots of 100 µl of the collected samples were added to the 96-well plates and 
incubated for 2 h at 37 °C. Then, 100 µl of the prepared detection reagent A was 
added to the wells followed by 1 h incubation at the same temperature. After 
aspiration and washing 3 times, 100 µl of the prepared detection reagent B was added to 
the wells and incubated for 30 min at 37 °C. After aspiration and washing 5 times, 
90 µl of substrate solution was added to the wells followed by incubation for 
25 min at 37 °C. Then 50 µl of stopping solution was added to the wells and the 
absorbance of each well was read at 450 nm immediately. The concentration of 
FSTL1 was determined using the standard curve of the standard solutions. The test was 
performed 4 times (Extended Data Fig. 3a). 
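
Reading a concentration off a standard curve can be sketched as a piecewise-linear interpolation between the two standards that bracket the measured absorbance. The text does not say how the curve was fitted, so the interpolation scheme, the function name and the standard values below are all assumptions for illustration; only the 0.50 ng ml⁻¹ detection limit comes from the text.

```python
from bisect import bisect_left

DETECTION_LIMIT_NG_ML = 0.50  # detection limit quoted in the text


def interpolate_concentration(absorbance, standards):
    """Piecewise-linear read-out from a standard curve.

    standards: iterable of (concentration_ng_ml, absorbance_450nm) pairs.
    Assumes absorbance increases with concentration over the curve.
    """
    points = sorted(standards, key=lambda p: p[1])
    abs_values = [a for _, a in points]
    if not abs_values[0] <= absorbance <= abs_values[-1]:
        raise ValueError("absorbance lies outside the standard curve")
    i = bisect_left(abs_values, absorbance)
    if abs_values[i] == absorbance:
        return points[i][0]
    (c0, a0), (c1, a1) = points[i - 1], points[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


if __name__ == "__main__":
    # Hypothetical standards: (ng/ml, absorbance at 450 nm).
    curve = [(0.5, 0.10), (1.0, 0.20), (2.0, 0.40), (4.0, 0.80)]
    conc = interpolate_concentration(0.30, curve)
    flag = "" if conc >= DETECTION_LIMIT_NG_ML else " (below detection limit)"
    print(f"{conc:.2f} ng/ml{flag}")
```

A four-parameter logistic fit is a common alternative for ELISA data; the linear scheme is used here only because it needs no external libraries.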

Application of the patch in a swine model of ischaemia-reperfusion. The swine 
study was performed by inflation of a percutaneous coronary angioplasty dilation 
catheter to occlude the LAD in Yorkshire pigs (45 days old). An occlusion time of 
90 min was followed by full reperfusion to mimic the clinical MI disease model. 
One week after MI, a left thoracotomy was performed and the patch (6-cm 
diameter) was sutured onto the infarct. Animal groups included: sham controls, 
I/R with no treatment (n = 3), I/R treated with patch alone (I/R plus patch, n = 1), 
and I/R treated with patch laden with FSTL1 (I/R plus patch with FSTL1, n = 2). 
EdU delivery: 250 mg per week EdU was infused into the circulation during the 
4-week time course of the study (week 1 to week 5 post I/R), using osmotic mini 
pumps (Fig. 5). 

Animal compliance. The procedures involving animal use and surgeries were 
approved by the Stanford Institutional Animal Care and Use Committee 
(IACUC). Animal care and interventions were provided in accordance with the 
Laboratory Animal Welfare Act (C57BL/6J wild-type mice (Figs 1p-t, 2h-k, 3a-p, 
3-6), Myh6mERCremER:Rosa26Z/EG C57BL/6J mice (Fig. 3q-v, Extended Data 
Fig. 7a-f), Yorkshire pigs (Fig. 5)). 

The study protocol was approved by the Institutional Animal Care and Use 
Committee (IACUC) of Boston University (wild-type and Fstl1-TG C57BL/6J 
mice (Fig. 3c and Extended Data Fig. 2)). 

Mice were used in accordance with the guidelines of the Institutional Animal 
Care and Use Committee (IACUC) of the Institute for Nutritional Sciences, 
Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences 
(Tnt-cre;Rosa26mTmG/+, Wt1CreERT2/+;Rosa26mTmG/+ mice (C57BL/6J and ICR mixed 
background) (Fig. 1j-m), Wt1CreERT2/+;Rosa26RFP/+ mice (C57BL/6J and ICR 
mixed background) (Fig. 2l)). 

All animal studies were approved by the Institutional Animal Care and Use 
Committee (IACUC) of Sanford-Burnham-Prebys Medical Discovery Institute. 
All animal procedures performed conform to the NIH guidelines (Hsd:SD rats 
(Fig. 4g, m, n and Extended Data Fig. 7g-j)). 

Statistical analysis. The number of samples (n) used in each experiment is 
recorded in the text and shown in figures. All in vitro experiments have been 
done at least twice independently. Gene expression experiments have been done 
3 times independently and EdU proliferation assays and cell size measurement 
have been done more than 10 times independently. Sample size was not 
predetermined; retrospective analysis of significantly different results in most 
in vitro studies using G*Power 3.1 gave power > 0.8. Sample sizes for animal 
studies were estimated. Animals that did not survive up to 4 weeks after surgery 
were excluded from functional and histological studies. Randomization was not 
applied. Blinding to group allocation was practised between animal surgery and 
results analysis of mouse myocardial infarction experiments. The values 
presented are expressed as means ± s.e.m. The rationale for using means ± s.e.m. 
instead of s.d. is that s.e.m. quantifies uncertainty in an estimate of the mean 
whereas s.d. indicates dispersion of the data from the mean. In other words, the s.e.m. 
provides an estimate of the precision of the reported mean value, while the s.d. gives an idea of the 
variability of single observations. Normal distribution was tested and confirmed 
in automatic analysis of mCMsESC (Figs 1c, g, i, 2d, f, g and 4d, e, j-n and 
Extended Data Figs 1 and 7h, j). We did not estimate variations in the data. 
The variances are similar between the groups that are being statistically 
compared. One-way ANOVA with multiple comparisons (Figs 1r, 3 and Extended 
Data Figs 2b, d and 4, 5, 6) and Student's t-test (Figs 1a-m, 2 and 4 and Extended 
Data Figs 2e-n and 7) were used to test for statistical significance (P < 0.05). 
Survival curves were generated by the Kaplan-Meier method using PRISM 
(GraphPad) and the log-rank (Mantel-Cox) test was used to test for significant 
differences between the survival of mice in different conditions (Fig. 3a). 
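
The distinction drawn above between s.d. and s.e.m. follows from the relation s.e.m. = s.d. / √n. The sketch below computes both for a small set of hypothetical values; it uses only the Python standard library and is an illustration, not the software used for the published analysis.

```python
import math
import statistics


def mean_sd_sem(values):
    """Return (mean, sample s.d., s.e.m.) for a list of observations."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are needed")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)   # sample standard deviation (n - 1 denominator)
    sem = sd / math.sqrt(n)         # uncertainty of the estimated mean
    return mean, sd, sem


if __name__ == "__main__":
    # Hypothetical fractional-shortening values (%) from four animals.
    mean, sd, sem = mean_sd_sem([28.0, 30.0, 32.0, 30.0])
    print(f"mean = {mean:.1f}, s.d. = {sd:.2f}, s.e.m. = {sem:.2f}")
```

Because the s.e.m. shrinks with √n while the s.d. does not, error bars reported as s.e.m. reflect the precision of the mean rather than the spread of individual animals.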

Extended Data Figure 1 | Characterization of mCMsESC cells used in this 
study. a, Schematic time-line of cell preparation and treatment. 
b-d, Immunostaining of α-actinin of mCMsESC, showing that the majority of 
the cells are α-actinin⁺ (b), and the α-actinin lacks striation structures 
(c). d, Immunostaining of α-smooth muscle actin (αSMA) of mCMsESC, 
showing the majority of the cells are αSMA⁺, unlike mature cardiomyocytes 
with no αSMA expression. e, f, Automatic detection of EdU incorporation in 
mCMsESC. Captured image of mCMsESC treated with 10 µg ml⁻¹ EdU for 24 h, 
stained with EdU, α-actinin and DAPI using InCell 1000 (General Electric) 
(e). Overlay of masks of EdU, α-actinin and DAPI channels with automatic 
detection software (f). g, EdU incorporation profile of mCMsESC over time. 
mCMsESC are treated with 10 µg ml⁻¹ EdU for 24 h at time 0 h, 24 h, 48 h, and 
144 h. The percentage of EdU⁺/α-actinin⁺ cardiomyocytes of all α-actinin⁺ 
cardiomyocytes is calculated for each time period. Note the decrease of EdU 
incorporation rate over time. h, i, Fluo-4 calcium images of mCMsESC, with 
baseline background image (h) and peak image (i). j, Comparison of 
representative calcium transients of mCMsESC (red) and neonatal rat 
ventricular cardiomyocytes (NRVC, blue). Note the reduced amplitude, slower 
rate of up and down strokes, and elongated duration of the calcium transient in 
mCMsESC compared to NRVC, suggesting immature calcium handling in 
mCMsESC. In all experiments, FSTL1 was added one day after plating of the 
mCMsESC (time 0-24 in this figure). 

Extended Data Figure 2 | Myocardial overexpression of Fstl1 (Fstl1-TG) 
mice after permanent LAD ligation. a-d, Fstl1 protein expression kinetics 
after myocardial infarction. Fstl1-TG mice (C57BL/6 background) and 
littermate wild-type (WT) mice underwent LAD ligation. Heart tissue and 
serum were collected at baseline, day 1, day 3, day 7 and day 28 after surgery. 
Fstl1 protein levels in ischaemic area (IA) and remote area (RM) of heart were 
analysed by western blotting (a). Fstl1 expression expressed relative to tubulin 
levels is reported (b). Fstl1 serum levels were analysed by western blotting 
(c). Also shown is Ponceau-S staining to indicate equal loading of serum. 
Quantification of serum Fstl1 level is shown in (d). n > 3 in all groups. 
*P < 0.05 compared to WT baseline, #P < 0.05 compared to Fstl1-TG baseline. 
ANOVA was used for statistical significance (P < 0.05). e-j, Morphometric 
and functional response of Fstl1-TG mice to permanent LAD ligation in the long 
term. Representative Masson's trichrome staining of WT (e) and Fstl1-TG 
(f) 4 weeks after MI. Quantification of content in fibrotic tissue at week 4 after 
MI (g). Echocardiographic measurement of left ventricular internal dimension 
in systole (LVIDs) (h), and left ventricular internal diameter in diastole 
(LVIDd) (i) at weeks 2 and 4 after MI. Echocardiographic determination of 
fractional shortening (FS%) in the indicated genotypes at 2 and 4 weeks after MI 
(j). k-n, Double immunofluorescent staining of α-actinin (cardiomyocytes) 
and pH3 (mitosis) (k) and α-actinin (cardiomyocytes) and von Willebrand 
factor (vascular endothelial cells) (m) in the Fstl1-TG and WT mice, quantified 
in (l, n). n = 5, *P < 0.05 indicates significantly different from WT. 

Extended Data Figure 3 | Patch with FSTL1 attenuated fibrosis after MI. 
a-f, FSTL1 retention in the patch in vitro and in vivo. a, b, Enzyme-linked 
immunosorbent assay used to measure the amount of FSTL1 retained within 
collagen scaffolds exposed to PBS in vitro for different time intervals 
(0-21 days) (a). The table lists the initial and final FSTL1 concentration, as well as 
the release values within the first 24 h (b). c-f, FSTL1 retention in the patch in 
vivo. Representative images of Fstl1 immunostaining in the indicated animal 
treatment groups, week 4 after surgery. Note that, while Fstl1 is expressed in the 
uninjured epicardium (arrow in the inset in c), its expression became 
undetectable within the infarct area after MI (d). Similarly, no FSTL1 was 
detected in the MI plus patch animals (e), while it still persists (red staining) in 
the patch area of the MI plus patch with FSTL1 group (f). g, Representative 
Masson's trichrome staining on serial cross sections of hearts under 4 
conditions (sham, MI only, MI with patch and MI plus patch with FSTL1) 
4 weeks after MI. Note the severe fibrosis in the MI only condition, reduced 
fibrosis in the MI plus patch condition, and further reduction in the MI plus patch with 
FSTL1 condition, quantified in Fig. 3d. h-j, Representative MRI images from 
the mouse MI only, MI plus patch and MI plus patch with FSTL1 treatment 
groups showing the 3D-FSPGR (fast spoiled gradient-echo) images and the 
delayed enhancement images using gadolinium contrasting agents, confirming 
a reduction in infarct area (demarcated in green) and preserved contractility 
(Supplementary Videos 3-5). k, Trichrome staining of infarct and border zone 
of the indicated treatments demonstrates the integration of the patch with the 
host tissue and massive patch cellularization by the native cardiac cells. Observe 
the abundant muscle (red) inside the patch and in the border zone of the patch 
with FSTL1 treated animals (three right panels, green arrowheads). 

Extended Data Figure 4 | Analysis of patch with FSTL1 function in the 
mouse model of ischaemia/reperfusion (I/R) with delayed patch grafting. 
a-c, Heart function evaluation for sham, I/R, and I/R treated with patch with 
FSTL1, at end diastole and end systole, pre-grafting (a, 1 week post-injury), 
2 weeks post patch implantation (b), and 4 weeks post grafting (c). Values were 
normalized by dividing by pre-surgery baseline values for each individual 
animal. d, Absolute values of fractional shortening (FS, %) at different times 
pre and post I/R as evaluated by echocardiography of mice from 
a-c. Abbreviations same as in Fig. 3. *P < 0.05 compared to sham and black 
circle P < 0.05 compared to I/R. e, Co-immunofluorescence staining of DNA 
duplication marker phospho-Histone3 Ser10 (pH3, green) and α-actinin (red) 
in the border zone of patch with FSTL1 treated heart 4 weeks after MI. 
f, Quantification of incidence of pH3⁺, α-actinin⁺ double positive cells in the 3 
experimental groups. Data collected from 3 hearts in each group with 3 
different cross sections counted for total pH3⁺, α-actinin⁺ cells in each heart. 
*P < 0.05 indicates statistically different from all other groups. 

Extended Data Figure 5 | Representative images and quantification of 
cardiomyocyte proliferation in vivo after patch with FSTL1 treatment. 
a-h, Immunostaining of the cardiomyocyte marker α-actinin (red) in the 
infarct area (b-d) and co-immunofluorescence staining of DNA duplication 
marker phospho-Histone3 Ser10 (pH3, green) and α-actinin (red) in the border 
zone (f-h), in the 4 treatment groups analysed 4 weeks post-MI, compared to 
sham-operated animals (a, e). Insets in (a-d) show lower magnification images 
with broken lines demarcating the border between the patch and host tissues. 
Arrowheads in g, h indicate α-actinin⁺ cardiomyocytes with pH3⁺ nuclei. 
i-k, Representative images of pH3⁺ cardiomyocytes in a patch with FSTL1 
treated heart. Masson's trichrome staining of a heart 4 weeks after MI treated 
with patch with FSTL1 (i). The adjacent slide was stained for α-actinin in 
j, corresponding to the black box area with infarction and the patch in i. The 
spotted line in j indicates the boundary between the heart and the patch. The 
adjacent slide was stained for α-actinin and pH3, and all α-actinin⁺, pH3⁺ 
double positive cardiomyocytes found are shown in k (white arrowhead), 
with each image corresponding to the area in numbered white boxes in 
j. l-n, Quantification of cardiomyocyte proliferation measured in 3 cross 
sections covering the infarct, patch, and separated by 250 µm, between 1-2 mm 
from the apex in each heart (Fig. 3j). Data collected from 5-7 hearts in each 
group with the 3 cross-sections counted exhaustively for incidence of 
α-actinin⁺ cells positive for pH3 (l), midbody-localized aurora B kinase 
between α-actinin⁺ cells (m), and double-positive cells for pH3 and the nuclear 
cardiomyocyte marker PCM1 (n), and normalized to myocardium area 
quantified by trichrome staining of the immediately adjacent section. *P < 0.05, 
statistically different from sham. **P < 0.05, statistically different from all 
other groups. o, Quantification of hypertrophy in all experimental groups, 
measured by counting cardiomyocytes in areas of intraventricular wall with 
perpendicular cross-sections of cardiomyocytes in all hearts analysed for 
cardiomyocyte proliferation. No significant differences were found between samples. 
p-r, Quantification of incidence of α-actinin⁺ cells positive for pH3 
(p), midbody-localized aurora B kinase between α-actinin⁺ cells (q), and 
double-positive cells for pH3 and the nuclear cardiomyocyte marker PCM1 
(r) measured in l, m, relative to the total number of cardiomyocytes, calculated using 
hypertrophic analysis results in o. *P < 0.05, statistically different from sham, 
**P < 0.05, statistically different from all other groups. s, t, Quantification of 
incidence of α-actinin⁺ cells positive for pH3 (s) and midbody-localized aurora 
B kinase between α-actinin⁺ cells (t), separated by their localization in the 
border zone or infarcted area. Note that the majority of proliferation quantified by 
both methods is located in the border zone. *P < 0.05, statistically different 
from all other groups. 

Extended Data Figure 6 | Effect of implantation of patch with FSTL1 on 
apoptosis and inflammation. a, Representative TTC staining at day 2 post 
MI/patch treatment of all four groups (sham, MI, MI plus patch, MI plus patch 
with FSTL1). b, Quantification of area at risk comparing all 4 groups. Data 
collected from 4 hearts in each group, with 4 cross-sections, approximately 
2 mm thick each, encompassing each heart. *P < 0.05, statistically different 
from the sham. c, d, Representative images of TUNEL assays (TUNEL, green, 
α-actinin, red) comparing hearts 2 days after MI with patch alone and patch 
with FSTL1. e, Quantification of TUNEL⁺, α-actinin⁺ cells in the infarcted area, as a 
percentage of the total number of cardiomyocytes. No difference is observed 
between MI plus patch and MI plus patch with FSTL1 conditions. Data 
collected from 3 hearts in each group with 3 different cross-sections (same as in 
Fig. 3j). Ten 0.09 mm² images were taken from the infarcted area of each section and 
counted for TUNEL⁺, α-actinin⁺ and total α-actinin⁺ cells. f-j, TUNEL 
staining for cell death and α-actinin staining for cardiomyocytes were 
performed on hearts treated with patch-only and patch with FSTL1 at day 4 and 
day 8 after MI (f-i). Minimal TUNEL⁺, α-actinin⁺ cells are detected while 
there is a significant number of TUNEL⁺, α-actinin⁻ cells. Quantification of 
all TUNEL⁺ nuclei showed no significant differences between patch and patch 
with FSTL1 treated hearts at both time points (j). k-o, Immunostaining of 
F4/80 for macrophages and α-actinin for cardiomyocytes were performed on the 
same hearts as in panels a-d (k-n). Quantification of F4/80⁺ cells showed no 
significant differences between patch and patch with FSTL1 treated hearts at 
both time points (o). 
Extended Data Figure 7 | FSTL1 does not induce proliferation in adult and
neonatal cardiomyocytes, or cardiac progenitor cells. a–f, Adult
cardiomyocytes derived from mouse primary isolation. a, Visualization of
GFP⁺ cardiomyocytes isolated from Myh6^mERCremER:Rosa26^Z/EG mice treated
with 4-OH-tamoxifen (OH-Tam) in 3D-collagen patches. b–d, Gene
expression changes in adult cardiomyocytes treated with FSTL1, including
proliferation (b), cardiac-specific (c), and hypertrophy (d) markers. Note no
changes in expression of cardiac-specific genes, no increase in cell cycle markers
(consistent with undetectable Ki67 immunostaining), and decreased
hypertrophy markers (n = 3). Cardiomyocytes embedded within the 3D
patch were treated with FSTL1 (10 ng ml⁻¹) for 7 days with media
change every 2 days. e, f, FUCCI assay in 3D-cultured adult cardiomyocytes,
conducted 1 week after the 3D culture. e, Treatment with FSTL1 was performed
for 7 days with media change every 2 days. f, Adult cardiomyocytes 3D-cultured
as control in the absence of FSTL1. Note no detectable sign of cardiomyocytes in
S/G2/M phases (GFP⁺) in either condition. Purple arrows point to purple-colored
nuclei resulting from co-localization of Hoechst (blue) and G1 phase
FUCCI (red) labelling. g–j, Primary neonatal rat ventricular cardiomyocytes
(NRVC). g, h, Freshly isolated NRVCs stimulated with FSTL1 for 48 h with
10 μg ml⁻¹ EdU, and stained for α-actinin (red) and EdU (green). Percentages
of EdU⁺/α-actinin⁺ cardiomyocytes among all α-actinin⁺ cardiomyocytes are
quantified (h). i, j, NRVCs stimulated with FSTL1 for 48 h, and stained for
α-actinin (red) and pH3 (green). Percentages of pH3⁺/α-actinin⁺
cardiomyocytes among all α-actinin⁺ cardiomyocytes are quantified (j). No increase
of proliferation is found upon FSTL1 treatment (n = 4). *P < 0.05, statistically
different from control. k–m, Sca1⁺ progenitor cells'® were starvation-synchronized
for 48 h and stimulated with FSTL1 or control growth medium
for 72 h in the presence of EdU. Clone 3 was obtained by clonal growth from the
Lin⁻Sca1⁺ SP fraction. The Sca1 pool was obtained from Lin⁻Sca1⁺ cells without clonal
growth. k, EdU and DAPI staining of Sca1⁺ cells after 72 h treatment.
l, Percentage of EdU⁺ Sca1⁺ cells after 72 h treatment. FSTL1 concentration: 0,
1, 10, 100 ng ml⁻¹. Abbreviations: SS, serum starvation; CGM, control growth
medium. m, Number of Sca1⁺ cells after 72 h FSTL1 treatment (n = 5). No
significant change is found upon FSTL1 treatment.


©2015 Macmillan Publishers Limited. All rights reserved 




Extended Data Table 1 | Raw echocardiography values (average ± s.e.m.) obtained at days 0 (baseline), 14, and 28 post treatment in a mouse
model of permanent LAD ligation


LVIDd LVIDs LVPWd LVPWs EF FS
(mm) (mm) (mm) (mm) (%) (%)
Sham 3.89 + 2.444 1.10 + 1.44 + 75.40 + 37.76 + 
Baseline, n=10 0.07 0.07 0.06 0.04 1.33 1.48 
Ml-only 3.94 + 2.28 + 1.29 + 1.68 + 78.53 + 43.50 + 
Baseline, n=10 0.06 0.09 0.09 0.11 1.68 1.82 
MI+Patch 4.08 + 2.50 + 1.07 + 1.49 + 77.75 + 41.54 + 
Baseline, n=10 0.10 0.16 0.03 0.12 2.67 1.98 
MI+Patch+CM 3.84 + 2.28 + 1.194 1.60 + 77.63 + 40.72 + 
Baseline, n=10 0.09 0.07 0.04 0.05 0.71 0.66 
MI+Patch+FSTL1 4.04 + 2.42 + 1.19 + 147+ 73.12 + 39.12 + 
Baseline, n=10 0.14 0.11 0.05 0.08 2.52 2.04 
Sham 3.96 + 2.53 +4 1.144 1.434 7303 38.01 + 
Week 2, n=10 0.17 0.11 0.04 0.07 2.21 1.58 
Ml-only 5.31 + 4.08 + 0.85 + 0.91 + 35.58 + 16.48 + 
Week 2 post injury, n=8 0.23 0.29" 0.05 0.09 3.57 1.67 
Mi+Patch 4.63 + 3.56 + 1.114 1.36 + 48.17 + 21.69 + 
Week 2 post injury,n=8 = 0.08** 0.16*° 0.05° 0.09° R\spee Die 
Mi+Patch+CM 445+ 3.10+ 1.124 1.40 + 64.18 + 30.35 + 
Week 2 post injury,n=8  0.16** 0.15*° 0.04*° 0.06*° 1.90*°" 1.30*°" 
Mi+Patch+FSTL1 4.77+ 3.72 + 1.02+ 1.144 50.31 + 23.17 + 
Week 2 post injury,n=9 = 0.23** gee oor". 0.08*° SigAnae a (asooe 
Sham 3.98 + 255+ 1.02 + 1.414 71.32 + 35.17 + 
Week 4, n=10 0.09 0.12 0.07 0.07 2.63 1.60 
Ml-only 5.27 + 455+ 0.70 + 0.78 + 35.32 + 1271+ 
Week 4 post injury, n=8 0.20* 0.24* 0.04* 0.06* 2.64* 1.28" 
Mi+Patch 5.04 + S77 + 1.02 + 1.284 51.43 + 22.71 + 
Week 4 post injury,n=8  0.12*° 0.11*° 0.08° 0.08° 1.62*° 0.92*° 
Ml+Patch+CM 435+ 3.20 + 1.234 1.534 62.15+ 27.52+ 
Week 4 post injury, n=8 018°" (24 0.05° 0.06° 436°" 250° -5 
MI+Patch+FSTL1 4.244 2.86 + 1.154 1.50 + 67.73 + 32.87 + 
Week 4 post injury, n=9 = 0.12°" 0.13*° 0.06° 0.08° 1.79°" 426°"" 


The patch was implanted simultaneously with injury. *P < 0.05, statistically significant difference in comparison with sham; black circle, statistically significant difference (P < 0.05) in comparison with MI-only;
black square, statistically significant difference (P < 0.05) in comparison with MI plus patch; black triangle, statistically significant difference (P < 0.05) in comparison with MI plus patch plus CM.
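
The FS column is tied to the two LVID columns by the standard echocardiographic definition, FS = 100 × (LVIDd − LVIDs) / LVIDd. The following is a minimal Python sketch of that relation, assuming group means read from the table above (sham at baseline and MI-only at week 4). The function name is illustrative only, and because the tabulated FS values are presumably averaged per animal, they need not equal a ratio of group means exactly.

```python
def fractional_shortening(lvidd_mm: float, lvids_mm: float) -> float:
    """Fractional shortening (%) from end-diastolic and end-systolic
    left-ventricular internal diameters, both in millimetres."""
    if lvidd_mm <= 0:
        raise ValueError("LVIDd must be positive")
    return 100.0 * (lvidd_mm - lvids_mm) / lvidd_mm


# Group means taken from the table (assumed readings of the printed values).
sham_baseline_fs = fractional_shortening(3.89, 2.44)
mi_only_week4_fs = fractional_shortening(5.27, 4.55)

print(f"Sham, baseline:  FS from mean diameters = {sham_baseline_fs:.1f}%")
print(f"MI-only, week 4: FS from mean diameters = {mi_only_week4_fs:.1f}%")
```

Both estimates fall within about one percentage point of the tabulated FS means (37.76% and 12.71%), which is the level of agreement expected when a ratio of means is compared with a mean of per-animal ratios.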
Extended Data Table 2 | Raw echocardiography values (average ± s.e.m.) in the long term (months 2 and 3) post treatment, in a mouse model of
permanent LAD ligation


LVIDd LVIDs LVPWd LVPWs EF FS
(mm) (mm) (mm) (mm) (%) (%)
Sham 4.26 + 2.68 + 1.20 + 1.414 72.00 + 35.75 + 
Month 2, n=4 0.05 0.03 0.03 0.07 0.41 0.25 
MI 5.53 + 476+ Dee 0.91 + 33.98 + 13.88 + 
Month 2, n=4 0.17* 0.19* 0.04* 0.09* 3.24* 1.56* 
Mli+Patch 5.05 + 3.92 + 0.92 + 1.274 51.13 + 21.50 + 
Month 2, n=4 0.09** 0.04*° 0.09* 0.08* 2.70*° 1.44*° 
MI+Patch+FSTL1 3.96 + 2.58 + 1.27+ 1.50 + 68.25 + 33.19 + 
Month 2, n=4 0.16°" 0.09°" 0.04°" 0.09°" 075" O70" 
Sham 441+ 3.01 + 1.124 1.344 67.00 + 32.38 + 
Month 3, n=4 0.13 0.13 0.02 0.01 1.74 1.28 
MI 4.93 + 414+ 0.90 + 0.99 + 38.00 + 15.33 4 
Month 3, n=4 27" 0.16* 0.03* 0.09* 2.65* 1.20* 
Mi+Patch 4.94 + 3.68 + 0.90 + 1.18 + 53.50 + 23.58 + 
Month 3, n=4 0.23 0.22* 0.10 0.02* 2.63*° 1.57*° 
MI+Patch+FSTL1 4.65 + 3.19 + 0.93 + tle1l7/ = 66.19 + 31.38 + 
Month 3, n=4 0.04° 0.06°" 0.09 0.07°" 1.12°8 1.46°" 


The patch was implanted simultaneously with injury. *P < 0.05, statistically significant difference in comparison with sham; black circle, statistically significant difference (P < 0.05) in comparison with MI-only;
black square, statistically significant difference (P < 0.05) in comparison with MI plus patch; black triangle, statistically significant difference (P < 0.05) in comparison with MI plus patch plus CM.
Extended Data Table 3 | Raw echocardiography values (average ± s.e.m.) of delayed grafting in a mouse model of ischaemia/reperfusion


LVIDd LVIDs LVPWd LVPWs EF FS
(mm) (mm) (mm) (mm) (%) (%)
Sham 3.62 + 2.39 + 1.134 1.44 + 7498+ 3567+ 
Baseline, n=4 0.15 0.14 0.09 0.10 257 1.03 
IR 3.90 + 2.59 + 1.26 + 1.45 + 73.064 36.814 
Baseline, n=4 0.10 0.03 0.05 0.11 2.98 2.36 
IR+Patch+FSTL1 4.09 + 2.67 + 1.27 + 1.50 + 7750+ 40.834 
Baseline, n=4 0.09 0.06 0.03 0.05 2.57 2.24 
ee ee 3.62 + 2.274 116+ 1.434 71.674 35.504 
abate tel 0.07 0.05 0.05 0.03 see 2.50 
Pre-graft, n=4 
cee eset 4.27 + 3.33 + 1.00 + 1.124 48.33+ 2213+ 
* * * * * 
Pre-graft, a=4 0.16 0.16 0.03 0.03 0.51 0.84 
Gaerne aie 434+ 3.49 + 1.02 + 0.99 + 46.69 + 19.74 + 
rs * * * * 
Br aIaiicd 0.32 0.32 0.03 0.05 3.04 1.57 
eh Saccoiias 3.78 + 2.46 + 1.154 1.35 + 69.95+ 34.044 
ce ope aie fed 0.24 0.15 0.06 0.06 3.59 2.44 
uae apostinaey 4.49 + 3.56 + 0.91 + 0.99 + A722+ 20124 
* * * * * * 
et Sse rd 0.16 0.14 0.07 0.04 2.54 1.36 
Papal dies 4344 2.77+ 1.08 + 1.44 + 70.78 + 33.67 + 
ee post injury e x@ e e x® 
Week 2 post graft, n=4 0.12 0.14 0.00 0.04 2.42 2.19 
es See uiniOny 4.03 + 275+ 1.05 + 134+ 6646+ 33.084 
Wscka ceevarniened 0.16 0.21 0.09 0.10 3.92 2.47 
cee spostigniey 4.63 + 3.84 + 0.87 + 0.95 + 42.08 + 16.81 + 
* * * * * 
WEA aig Ae 0.16 0.10 0.05 0.09 0.96 1.13 
Dae sii 4.09 + 257+ 1.20 + 141+ 73.16 + 34.44 + 
eaten ohaaren aie A 0.15* 0.12° 0.04° 0.10° 1.13° 0.87° 


Week 4 post graft, n=4 


Data obtained at baseline (pre-injury, pre-grafting), week 1 (post-injury, pre-grafting) and weeks 3 and 5 (post-injury, post-grafting). The patch was implanted 1 week after injury. *P < 0.05, statistically significant
difference in comparison with sham; black circle indicates statistically significant difference (P < 0.05) in comparison with MI-only; black square indicates statistically significant difference (P < 0.05) in comparison
with MI plus patch; black triangle indicates statistically significant difference (P < 0.05) in comparison with MI plus patch plus CM.
Structure of the toxic core of α-synuclein from invisible crystals


Jose A. Rodriguez'*, Magdalena I. Ivanova!*+, Michael R. Sawaya'*, Duilio Cascio, Francis E. Reyes**, Dan Shi’, 

Smriti Sangwan!, Elizabeth L. Guenther’, Lisa M. Johnson!, Meng Zhang", Lin Jiang'+, Mark A. Arbing', Brent L. Nannenga’, 
Johan Hattne?, Julian Whitelegge’®, Aaron S. Brewster*, Marc Messerschmidt°+, Sébastien Boutet®, Nicholas K. Sauter*, 
Tamir Gonen? & David S. Eisenberg’ 


The protein α-synuclein is the main component of Lewy bodies, the neuron-associated aggregates seen in Parkinson
disease and other neurodegenerative pathologies. An 11-residue segment, which we term NACore, appears to be
responsible for amyloid formation and cytotoxicity of human α-synuclein. Here we describe crystals of NACore that
have dimensions smaller than the wavelength of visible light and thus are invisible by optical microscopy. As the crystals
are thousands of times too small for structure determination by synchrotron X-ray diffraction, we use micro-electron
diffraction to determine the structure at atomic resolution. The 1.4 Å resolution structure demonstrates that this method
can determine previously unknown protein structures and here yields, to our knowledge, the highest resolution
achieved by any cryo-electron microscopy method to date. The structure exhibits protofibrils built of pairs of
face-to-face β-sheets. X-ray fibre diffraction patterns show the similarity of NACore to toxic fibrils of full-length
α-synuclein. The NACore structure, together with that of a second segment, inspires a model for most of the ordered
portion of the toxic, full-length α-synuclein fibril, presenting opportunities for the design of inhibitors of α-synuclein
fibrils.


The presynaptic protein α-synuclein, found in both soluble and membrane-associated
fractions of the brain, aggregates in Parkinson disease (PD).
These aggregates are the main component of Lewy bodies,
the defining histological feature of this neurodegenerative disease, and
have been shown to accompany neuronal damage’. Two other observations
point to aggregated α-synuclein as a molecular cause of PD’.
The first is that families with inherited forms of PD carry mutations in
α-synuclein, such as A53T, and abundant Lewy bodies**. The second
is that families with duplicated or triplicated genes encoding α-synuclein
develop early-onset PD, presumably because at high local concentrations
α-synuclein is forced into amyloid®’.

Our focus is on a central segment of α-synuclein, residues 68–78,
that we term NACore (Fig. 1), because of its critical role in both the
aggregation and cytotoxicity of α-synuclein. NACore lies within a
35-residue domain of α-synuclein termed NAC (non-amyloid-β component,
originally reported to be deposited with amyloid-β in the
brains of Alzheimer’s disease patients), which has been established
as necessary and sufficient for the aggregation and toxicity of
α-synuclein* * (Extended Data Fig. 1). For example, deletion of residues
71–82 prevents aggregation of α-synuclein in vitro, and abolishes
both its aggregation and neurotoxicity in a Drosophila model of PD”.
Yet this segment in isolation from the rest of α-synuclein readily
forms amyloid fibrils and is highly cytotoxic'*"*. Also, β-synuclein,
the close homologue of α-synuclein, which does not aggregate and is
not found in Lewy bodies, differs in sequence from α-synuclein principally
by the lack of residues 74–84 that are part of NACore’.

Segments outside NAC also influence the aggregation of α-synuclein
and have been associated with fibril structure'*”. In brain extracts from
patients with multiple system atrophy, the core of α-synuclein fibrils
extends approximately from residue 30 to 100’*. Also, the A53T mutation
of α-synuclein can accelerate its transition into the amyloid state, and
hence accelerate PD’’. This mutation was found to induce the onset of
PD at an early age”, and consistent with this, α-synuclein containing the
A53T mutation forms fibrils in vitro more rapidly than the wild type’.
Thus we carried out screens for crystals of peptide segments within the
NAC domain and adjacent regions, seeking structural information on
the molecular basis of aggregation and toxicity of α-synuclein.


Structure determination by MicroED 
Extensive crystal screens of two segments, NACore, residues
₆₈GAVVTGVTAVA₇₈, and PreNAC, ₄₇GVVHGVTTVA₅₆, seemingly
produced non-crystalline, amorphous aggregates. But upon examination
using electron microscopy, we found the aggregates to be clusters
of elongated nanocrystals only 50–300 nm in cross section and thus
invisible by conventional light microscopy (Fig. 1). We confirmed
well-ordered crystallinity of NACore at both the SACLA and LCLS free
electron lasers. We also found that a nine-residue fragment within
the NACore, which we term SubNACore, ₆₉AVVTGVTAV₇₇, yielded
crystals 1,000–10,000 times larger in volume than the NACore nanocrystals
(Fig. 1). We were therefore able to apply synchrotron methods”
to these larger crystals to determine the structure of their
amyloid-like fibrils. Although this nine-residue fragment is missing
only two residues compared with NACore, it is not as toxic”’, offering
some insight into the toxicity of α-synuclein, as described below.

To determine the structure of the invisible crystals of NACore and 
PreNAC, we turned to micro-electron diffraction (MicroED)™*”*. In 


1Howard Hughes Medical Institute, UCLA-DOE Institute, Departments of Biological Chemistry and Chemistry and Biochemistry, Box 951570, UCLA, Los Angeles, California 90095-1570, USA. *Howard 
4Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA. 5Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, California 94025,
USA. +Present addresses: Department of Neurology and Program of Biophysics, University of Michigan School of Medicine, Ann Arbor, Michigan 48109, USA (M.I.I.); Department of Neurology, UCLA, Los
Angeles, California 90095, USA (L.J.); National Science Foundation BioXFEL Science and Technology Center, Buffalo, New York 14203, USA (M.M.).


*These authors contributed equally to this work. 


486 | NATURE | VOL 525 | 24 SEPTEMBER 2015 




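[Figure 1 graphic. Panel a: β-strand assignments from Comellas et al. 2011 (ssNMR), Vilar et al. 2008 (ssNMR, HD) and Chen et al. 2007 (SDSL, EPR), drawn above a domain map of α-synuclein (amphipathic N terminus, NAC, acidic C terminus; residue markers 1, 61, 95 and 140) with the A53T site marked; sequence shown: GVVHGVTTVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAGSIAAATGFVK; vertical axis in kcal mol⁻¹. Panel b: crystals of ₄₇GVVHGVTTVA₅₆ and ₆₉AVVTGVTAV₇₇.]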


Figure 1 | NACore (residues 68–78) is the fibril-forming core of the NAC
domain of full-length α-synuclein. a, Top, segments identified as β-strands
by electron paramagnetic resonance (EPR), solid state nuclear magnetic
resonance (ssNMR), hydrogen–deuterium exchange (HD), and site directed
spin labelling (SDSL)'”*°". Bottom, predictions of the propensity of six-residue
segments to form amyloid fibrils. The vertical axis indicates the propensity
of steric zipper formation in Rosetta energy units**. Zipper-forming segments
are predicted where bars cross the −23 kcal mol⁻¹ threshold marked by the
dashed line. Blue-to-red color gradient indicates weak-to-strong propensity of
steric zipper formation. The A53T early-onset Parkinson mutation is indicated
by a red arrow and red letter T. b, The minuscule size of the PreNAC and
NACore crystals used for MicroED is illustrated by this comparison to
SubNACore microcrystals. Scale comparisons are illustrated at two
magnifications using phase contrast light microscope images and electron
micrographs, in which individual NACore and PreNAC nanocrystals are
indistinguishable by light microscopy.


MicroED, an extremely low-dose electron beam is directed onto a
nanocrystal within a transmission electron microscope under cryogenic
conditions, yielding diffraction patterns such as those in Fig. 2.
As the wavelength used in our experiments at 200 keV is very small
(0.025 Å), the Ewald sphere is essentially flat, resulting in diffraction
patterns that closely resemble a 2D slice through 3D reciprocal space.
As the crystal is continuously rotated in the beam, a series of such
diffraction patterns is collected”*. Scaling together diffraction data
collected from multiple crystals produces a full 3D diffraction data
set. MicroED has been successfully applied to the well-known structures
of hen egg-white lysozyme**”*, bovine liver catalase”” and
Ca²⁺-ATPase”®. But NACore and PreNAC are the first previously unknown
structures determined by MicroED.
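
The quoted wavelength and the flatness of the Ewald sphere can be checked with a short calculation. The Python sketch below is an illustration only: it assumes the standard relativistic de Broglie relation for an electron accelerated through 200 kV and Bragg's law at first order, and its function names are not taken from any MicroED software.

```python
import math

# CODATA constants in SI units.
PLANCK = 6.62607015e-34            # J s
ELECTRON_MASS = 9.1093837015e-31   # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 299792458.0          # m/s


def electron_wavelength_angstrom(voltage_v: float) -> float:
    """Relativistic de Broglie wavelength (Å) of an electron accelerated
    through a potential difference of voltage_v volts."""
    kinetic = ELEMENTARY_CHARGE * voltage_v
    rest = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * kinetic * (1.0 + kinetic / (2.0 * rest)))
    return PLANCK / momentum * 1e10


def bragg_angle_deg(d_angstrom: float, wavelength_angstrom: float) -> float:
    """First-order Bragg angle (degrees) for lattice spacing d."""
    return math.degrees(math.asin(wavelength_angstrom / (2.0 * d_angstrom)))


wavelength = electron_wavelength_angstrom(200e3)
theta_at_limit = bragg_angle_deg(1.4, wavelength)

print(f"Electron wavelength at 200 kV: {wavelength:.4f} Å")
print(f"Bragg angle for a 1.4 Å spacing: {theta_at_limit:.2f}°")
```

Under these assumptions the wavelength comes out close to 0.025 Å, and even the 1.4 Å reflections are diffracted through a Bragg angle of only about half a degree, which is why each pattern approximates a planar section of reciprocal space.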

For NACore and PreNAC, we collected MicroED patterns from 
nano-crystals that lay preferentially oriented, flat on the surface of a 
holey carbon Quantifoil grid, in a frozen-hydrated state. Grids were 
first screened for appropriately sized crystals, and candidate crystals 
screened for diffraction. We used crystals showing strong diffraction 
for data collection by continuous unidirectional rotation about a fixed 
Figure 2 | Diffraction from NACore nanocrystals is similar to that from
full-length α-synuclein fibrils. a, Single-crystal electron diffraction pattern
obtained during MicroED data collection (see text). Equally spaced concentric
rings denote resolution shells. The highest resolution spot is at 1.52 Å (arrow).
The inset shows the over-focused image of the diffracting crystal (arrow),
which is ~1,480 × 200 × 200 nm. Orientations of the reciprocal cell axes are
indicated by the arrows labelled a*, b*, c*. b, Composite of fibril diffraction
patterns from α-synuclein (α-syn) preparations and NACore. Full-length
α-synuclein reveals reflections that match those from NACore and
N-terminally acetylated α-synuclein. The two patterns of full-length
α-synuclein share with NACore three major peaks denoted by arrows: 8.2 Å
(orange), 4.6 Å (blue), and 2.4 Å (green). The origin of these peaks can be traced
to the (0,0,2), [(1,1,1),(−1,1,1)], and (0,2,0) planes in the NACore structure,
respectively. We attribute the strong 8.2 Å reflection to the spacing between
adjacent pairs of β-sheets.


axis, acquiring a series of diffraction frames at fixed time intervals”. 
The needle-shaped crystals typically exceeded the length needed for 
MicroED; those that were unbent and 100 to 300 nm wide produced 
the best diffraction patterns. Data from multiple crystals were inte- 
grated, scaled and merged together (Extended Data Table 1). 

The multi-crystal NACore and PreNAC data sets were phased by
molecular replacement, using the atomic model of SubNACore and
an ideal β-strand model, respectively, as probes. Residues of NACore
that were missing from the SubNACore probe were clearly revealed
in a difference density map calculated from NACore observed structure
factor amplitudes and phases from the SubNACore probe structure
(Extended Data Fig. 2). After subsequent refinement, two water
Figure 3 | Structure of the amyloid core of α-synuclein. a, The crystal
structure of NACore (orange) reveals pairs of sheets as in the spines of amyloid
fibrils. The A53T mutation (black) is shown in PreNAC (blue). The sheets in
both structures are related by the 2₁ fibril axes shown in black. The gaps left by
the interface are filled with water molecules which hydrogen-bond to the
threonine residues (partially showing aqua spheres). b, c, Orthogonal views of
the fibrillar assemblies. d, A speculative model of an α-synuclein protofibril
containing the A53T mutation (black), where the strong interface of NACore
(orange) forms the core of the fibril and its weaker interface interacts with
PreNAC (blue). e, The locations of 5 out of a possible 73 protons are suggested
by small, positive Fo − Fc density (green, contoured at 2.8σ, shown by arrows).
The blue mesh is 2Fo − Fc density contoured at 1.4σ.


molecules, and several hydrogen atoms, were observed (Fig. 3e). Full
models of NACore and PreNAC were refined against the MicroED
data, producing structures at 1.4 Å resolution with acceptable R factors
(Extended Data Table 1). Electron scattering factors were used in
the refinement calculations”.
NACore structure


The structure of the NACore peptide chain is a nearly fully extended
β-strand (Fig. 3 and Extended Data Fig. 3). These NACore strands stack
in-register into β-sheets, as had been predicted by site-directed spin
labelling'*'”. The sheets are paired (Fig. 3b), as is usual in amyloid spines,
and the pairs of sheets form typical steric-zipper protofilaments (Fig. 3c),
previously seen as the spines in many amyloid-like fibrils formed from
short segments of fibril-forming proteins*’. The unusual features of this
steric zipper are that the 11-residue width of the zipper is longer than has
been previously observed”, and each pair of sheets contains two water
molecules, each associated with a threonine side chain within the interface.
Most steric zippers are completely dry. Also, in our crystals of
NACore, each sheet forms two snug interfaces: interface A, with
268 Å² of buried accessible surface area per chain, is more extensive
and presumably stronger than interface B (167 Å²), because the terminal
residues of the chains in opposing sheets bend towards each other (Fig. 3
and Extended Data Fig. 4). The structure of PreNAC reveals a peptide
chain that forms a β-strand kinked at Gly51. These strands are arranged
into pairs of β-sheets that, like the NACore structure, interdigitate to
form steric zipper protofilaments (Fig. 3). Of special note, a five-residue
segment of PreNAC (₅₁GVTTV₅₅) differs in only one residue from a
five-residue segment of NACore (₇₃GVTAV₇₇), and their α-carbons superimpose
closely with a root mean square deviation (r.m.s.d.) of 1.5 Å
(Extended Data Fig. 4). This means that the weaker interface B of
NACore mimics a hypothetical interface between NACore and
PreNAC (Fig. 3d).


Relevance of NACore to Parkinson disease 


The relationship of the structure of NACore to fibrils of full-length
α-synuclein is established by the resemblance of their diffraction patterns.
Specifically, the fibre diffraction patterns of aligned fibrils of
full-length and N-terminally acetylated* α-synuclein protein display the
same principal peaks as the diffraction of aligned NACore nanocrystals
(Fig. 2). All three fibrils display the strong reflection at 2.4 Å in
their diffraction patterns. As seen in Fig. 3 and Extended Data Fig. 5,
this reflection arises in NACore because one β-sheet of the steric
zipper is translated along the fibre axis with respect to the other
β-sheet by 2.4 Å, one half the 4.8 Å spacing between β-strands, permitting
the two sheets to interdigitate tightly together. All three share
a strong 4.6 Å reflection, which in NACore results from both the
stacking of β-strands and the staggering between adjacent β-sheets
of the steric zipper, while a shared reflection near 8.2 Å probably
arises from the distance between the adjacent pairs of β-sheets that
make up the α-synuclein fibril (Fig. 2 and Extended Data Fig. 5). This
comparison of fibre diffraction patterns (Extended Data Table 2)
strongly suggests that the structure of NACore is similar to the spine
of the toxic fibrils of full-length α-synuclein.
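
Why a half-repeat stagger strengthens the 2.4 Å reflection can be illustrated with an idealized one-dimensional model. The Python sketch below rests on stated assumptions: each β-sheet is reduced to identical point scatterers repeating every 4.8 Å along the fibre axis, the second sheet is shifted by half a repeat, and only reflections along the axis are considered. It is not a simulation of the measured patterns, and the function name is hypothetical.

```python
import cmath

STRAND_SPACING = 4.8  # Å, repeat of β-strands along the fibre axis (from the text)


def axial_amplitude(order: int, sheet_offsets=(0.0, 0.5)) -> float:
    """Amplitude of the axial reflection of a given order for sheets placed
    at fractional offsets along the 4.8 Å repeat (idealized point scatterers)."""
    total = sum(cmath.exp(2j * cmath.pi * order * x) for x in sheet_offsets)
    return abs(total)


for order in (1, 2, 3, 4):
    spacing = STRAND_SPACING / order
    print(f"order {order}: d = {spacing:.1f} Å, "
          f"two staggered sheets = {axial_amplitude(order):.2f}, "
          f"single sheet = {axial_amplitude(order, (0.0,)):.2f}")
```

In this model the two staggered sheets scatter exactly out of phase at odd orders, so the 4.8 Å axial reflection cancels while the second order at 2.4 Å is reinforced. This is the same rule that a 2₁ screw axis imposes on reflections along its axis, and it is consistent with the 2.4 Å peak being indexed as (0,2,0).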

The combined structures of NACore and PreNAC allow us to
construct a speculative model for much of the ordered segments of
the A53T early-onset mutant α-synuclein (Fig. 3d). Experimental
support for this model comes from the agreement of its simulated fibre
diffraction with the measured diffraction patterns of α-synuclein and
N-acetyl α-synuclein fibrils, as well as aligned NACore nanocrystals
(Extended Data Table 2). Above we hypothesized that the weaker
interface B of NACore might mimic an intramolecular interaction
of PreNAC with NACore (Fig. 3). In fact, the interacting side chains
in the weaker NACore interface B (G73, T75 and V77) are identical to
the side chains (G51, T53 and V55) interacting in the hypothetical
interface of PreNAC with NACore. Assuming that this interface actually
forms in fibrils of the early-onset mutant A53T, we built the model
shown in Fig. 3d. The hypothetical interface of this model offers
a possible reason for a greater propensity of the A53T mutant to
aggregate than the wild-type sequence, conceivably leading to the
early onset of PD.

The identity and structure of the cytotoxic amyloid formed by
α-synuclein remain a subject of intensive research'?’'~**. The weight
Figure 4 | NACore aggregates faster than SubNACore and is more cytotoxic
to cultured cells. a, b, Cytotoxicity of NACore, SubNACore and α-synuclein
measured on PC12 cells using a 3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium
bromide (MTT) assay (a) and a lactate dehydrogenase
(LDH)-release assay (b). In both assays NACore is more toxic than
SubNACore. Also, shaken fibrils are more toxic than an equal concentration of
freshly dissolved sample. Results shown as mean ± s.e.m. based on triplicate
samples. A t-test was used to measure statistical significance; *P < 0.05,
***P < 0.001. c, Representative electron micrographs of NACore, SubNACore


of evidence over the past decade has tilted scientific opinion from the
fully developed amyloid fibrils found in Lewy bodies as the toxic
entities to smaller, transient amyloid oligomers. Yet recently, quantitative
arguments have been put forward in favour of fibrils**. Our
experiments on the cytotoxicity of NACore on PC12 cells (Fig. 4) are
consistent with the view that fibrils are toxic: we find that NACore
shaken and aggregated for 72 h displays abundant fibrils, is more toxic
than freshly dissolved NACore (Fig. 4), and is comparably toxic to
similarly aggregated full-length α-synuclein. We also find greater cytotoxicity
of NACore than SubNACore, which is shorter by two residues. This is
consistent with the more rapid fibril formation of NACore than of
SubNACore (Fig. 4d). These observations do not rule out the formation
of a non-fibrillar, oligomeric assembly, present, but undetected,
in our aggregated samples of NACore and α-synuclein. Of course,
NACore is merely a fragment of full-length α-synuclein, and lacks
most of the membrane-binding motifs of the N terminus of the protein,
which have been implicated in membrane disruption*”™*. Yet it is
clear that NACore is the minimum entity that recapitulates all the
features of full-length α-synuclein aggregation and toxicity.


MicroED diffraction of invisible crystals 

The minuscule size of NACore crystals is typical of amyloid and also of
various other biological crystals of interest. For amyloid crystals, our
speculation is that the tiny size is a consequence of the natural twist of
β-sheets that form the protofilaments of the fibrils. The crystal lattice
restrains the twist, creating a strain in these crystals, which increases
as crystals grow. Eventually this strain prevents further addition of
β-strands, limiting the thickness of the needle crystals. In our experience,
longer segments (for example, 11 residues compared to 9 residues)
limit crystal growth even more; in the case of 11-residue
NACore and 10-residue PreNAC, the strain produces nanocrystals,
invisible by optical microscopy. These crystals are too small for
mounting and conventional synchrotron data collection, but are
ideally suited for analysis by MicroED. They are ~10'° times smaller 
and α-synuclein (α-syn) samples tested for cytotoxicity. NACore and
α-synuclein show abundant fibrils but SubNACore shows few. NACore also
forms fibrils immediately upon dissolving, whereas SubNACore shows no
fibres, but instead amorphous aggregates. Scale bar, 500 nm. d, NACore and
SubNACore were aggregated in identical conditions and monitored by
turbidity. NACore begins to aggregate after 15 h while SubNACore forms no
aggregates for up to 50 h. Electron microscopy of the samples at 50 h confirmed
the turbidity readings (insets; scale bars, 2 μm), with error bars denoting
standard deviation based on triplicate samples.


than Perutz’s haemoglobin crystals and ~ 10’? times smaller than von 
Laue’s CuSO₄ crystal, which yielded the first X-ray diffraction pattern.
Our structures of NACore and PreNAC demonstrate that MicroED is
capable of determining new and accurate structures of biological
material at atomic resolution. This finding paves the way for applications
of MicroED to other biological substances of importance for
which only nanocrystals can be grown. In our particular application,
we have been able to learn the atomic arrangement of the core of the
crucial NAC domain. This presents opportunities for structure-based
design of inhibitors of amyloid formation of α-synuclein”.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized. The investigators were not blinded to 
allocation during experiments and outcome assessment. 

Crystallization. Microcrystals of SubNACore (₆₉AVVTGVTAV₇₇) were grown
from synthetic peptide purchased from CS Bio. Crystals were grown at room
temperature by hanging drop vaporization. Lyophilized peptide was dissolved in
water at 2.9 mg ml⁻¹ concentration in 48 mM lithium hydroxide. Peptide was
mixed in a 2:1 ratio with reservoir containing 0.9 M ammonium phosphate, and
0.1 M sodium acetate pH 4.6.
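
For comparison with the millimolar buffer components, the peptide mass concentration can be converted to an approximate molar concentration. The Python sketch below is a rough estimate under stated assumptions: average residue masses, free N and C termini, and no correction for counter-ions or residual water in the lyophilized powder. The function names are illustrative.

```python
# Average residue masses (Da) for the amino acids present in these peptides.
RESIDUE_MASS = {
    "G": 57.0519,
    "A": 71.0788,
    "V": 99.1326,
    "T": 101.1051,
    "H": 137.1411,
}
WATER_MASS = 18.0153  # Da, added once for the free N and C termini


def peptide_mass(sequence: str) -> float:
    """Average molecular mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def millimolar(mg_per_ml: float, sequence: str) -> float:
    """Convert a mass concentration (mg/ml) to millimolar."""
    return 1000.0 * mg_per_ml / peptide_mass(sequence)


subnacore_mass = peptide_mass("AVVTGVTAV")
subnacore_mm = millimolar(2.9, "AVVTGVTAV")

print(f"SubNACore mass: {subnacore_mass:.1f} Da")
print(f"2.9 mg/ml corresponds to about {subnacore_mm:.2f} mM")
```

On these assumptions the SubNACore stock is roughly 3.6 mM peptide, well below the 48 mM lithium hydroxide in which it is dissolved.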

Nanocrystals of NACore, ₆₈GAVVTGVTAVA₇₈, were grown from synthetic
peptide purchased from CS Bio. Ten batches of synthesized peptide (CS Bio) at
a concentration of 1 mg ml⁻¹ in sterile water were shaken at 37 °C on a Torrey
Pines orbital mixing plate at speed setting 9, overnight. The insoluble material was
washed in 30% (w/v) glycerol then stored in water at room temperature before
diffraction. The sample contained a mixture of fibrils and crystals.

Nanocrystals of PreNAC (47GVVHGVTTVA56) were grown from synthetic peptide 
purchased from InnoPep. Crystallization trials of synthesized peptide were 
prepared in batch. Peptide was weighed and dissolved in sterile-filtered 50 mM 
phosphate buffer pH 7.0 with 0.1% DMSO at a concentration of 5 mg ml⁻¹. This 
solution was shaken at 37 °C on a Torrey Pines orbital mixing plate at speed setting 9, 
overnight. 

Data collection and processing. X-ray diffraction data from microcrystals of 
SubNACore were collected using synchrotron radiation at the Advanced 
Photon Source, Northeast Collaborative Access Team micro focus beam line 
24-ID-E. The beam line was equipped with an ADSC Quantum 315 CCD 
detector. Data from a single crystal were collected in 5° wedges at a wavelength 
of 0.9791 Å using a 5 μm beam diameter. We used data from three different 
sections along the needle axis. The crystals were cryo-cooled (100 K) for data 
collection. Data were processed and reduced using Denzo/Scalepack from the 
HKL suite of programs. 

Electron diffraction data from nanocrystals of NACore and PreNAC were 
collected using MicroED techniques. These nanocrystals typically clump 
together. To break up the clumps, an approximately 100 μl volume of nanocrystals 
was placed in a sonication bath for 30 min. Nanocrystals were deposited onto 
a Quantifoil holey-carbon EM grid in a 2-3 μl drop after appropriate dilution, 
which optimized crystal density on the grid. All grids were then blotted and 
vitrified by plunging into liquid ethane using a Vitrobot Mark IV (FEI), then 
transferred to liquid nitrogen for storage. Frozen hydrated grids were transferred 
to a cryo-TEM using a Gatan 626 cryo-holder. Diffraction patterns and crystal 
images were collected using an FEG-equipped FEI Tecnai F20 TEM operating at 
200 kV and recorded using a bottom mount TVIPS F416 CMOS camera with a 
sensor size of 4,096 × 4,096 pixels, each 15.6 × 15.6 μm. Diffraction patterns were 
recorded by operating the detector in rolling shutter mode with 2 × 2 pixel binning, 
producing a final image 2,048 × 2,048 pixels in size. Individual image 
frames were taken with exposure times of 3-4 s per image, using a selected area 
aperture with an illuminating spot size of approximately 1 μm. This geometry 
equates to an electron dose of less than 0.1 e⁻ per Å² per second. During each 
exposure, crystals were continuously rotated within the beam at a rate of 0.3° per 
second, corresponding to a 1.2° wedge per frame. Diffraction data were collected 
from several crystals, each oriented differently with respect to the rotation axis. 
These data sets each spanned wedges of reciprocal space ranging from 40° to 80°. 
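
The rotation rate, exposure time and dose rate quoted above fix the angular width and the accumulated dose of each frame. The following is a minimal sketch of that arithmetic, assuming the 4 s exposure at the upper end of the stated range; the function names are illustrative and do not belong to any data-collection software.

```python
import math

# Per-frame geometry for continuous-rotation electron diffraction.
# Numbers come from the text; the function names are illustrative assumptions.

def wedge_per_frame(rotation_rate_deg_per_s: float, exposure_s: float) -> float:
    """Angular wedge swept while one frame is exposed (degrees)."""
    return rotation_rate_deg_per_s * exposure_s

def dose_per_frame(dose_rate_e_per_A2_per_s: float, exposure_s: float) -> float:
    """Upper bound on the dose accumulated in one frame (electrons per square angstrom)."""
    return dose_rate_e_per_A2_per_s * exposure_s

def frames_for_wedge(total_wedge_deg: float, frame_wedge_deg: float) -> int:
    """Number of whole frames needed to cover a wedge of reciprocal space."""
    return math.ceil(total_wedge_deg / frame_wedge_deg)

frame_wedge = wedge_per_frame(0.3, 4.0)      # 0.3 degrees per second, 4 s exposure
frame_dose = dose_per_frame(0.1, 4.0)        # dose rate stated as "less than 0.1"
frames_small = frames_for_wedge(40.0, frame_wedge)
frames_large = frames_for_wedge(80.0, frame_wedge)
```

Under these assumptions a 40° to 80° data set corresponds to a few dozen frames, each receiving well under one electron per square angstrom.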

X-ray diffraction data from nanocrystals of NACore were collected using XFEL 
radiation at the CXI instrument (Coherent X-ray Imaging) at the Linac Coherent 
Light Source (LCLS), SLAC. The photon energy of the X-ray pulses was 8.52 keV 
(1.45 Å). Each 40 fs pulse contained up to 6 × 10¹¹ photons at the sample position, 
taking into account a beam line transmission of 60%. The diameter of the 
beam was approximately 1 μm. We used a concentration of approximately 25 μl 
of pelleted material suspended in 1 ml water. The sample was injected into the 
XFEL beam using a liquid jet injector and a gas dynamic virtual nozzle. The 
micro jet width was approximately 4 μm and the flow rate was 40 μl min⁻¹. The 
sample caused noticeable sputtering of the liquid jet. XFEL data were processed 
using cctbx.xfel. 
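
The photon energy and wavelength quoted for the XFEL pulses are related by λ = hc/E. A short sketch of the conversion, using exact SI constants; the function name is an illustrative assumption.

```python
# Convert an X-ray photon energy to a wavelength: lambda = h * c / E.
H_PLANCK = 6.62607015e-34      # Planck constant, J s (exact SI value)
C_LIGHT = 299_792_458.0        # speed of light, m/s (exact SI value)
E_CHARGE = 1.602176634e-19     # elementary charge, C (exact SI value)

def photon_wavelength_angstrom(energy_keV: float) -> float:
    """Wavelength in angstroms of a photon with the given energy in keV."""
    energy_J = energy_keV * 1e3 * E_CHARGE
    return H_PLANCK * C_LIGHT / energy_J * 1e10

xfel_wavelength = photon_wavelength_angstrom(8.52)   # XFEL photon energy from the text
```

The result agrees with the quoted 1.45 Å to within rounding.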

Calibration of the sample to detector distance in MicroED was accomplished 
using a polycrystalline gold standard and by referencing the prominent reflec- 
tions in the electron diffraction experiment with the corresponding reflections in 
the XFEL data. Calibration of the x/y locations of the 64-tile CSPAD detector was 
performed by cctbx.xfel by refining the optically measured tile positions against a 
thermolysin data set. 

To gain compatibility with conventional X-ray data processing programs, the 
MicroED diffraction images were converted from tiff or TVIPS format to the 
SMV crystallographic format. We used XDS to index the diffraction images, and 
XSCALE for merging and scaling together data sets originating from different 
crystals. For NACore, data from four crystals were merged, while for PreNAC, 
data from three crystals were merged to assemble the final data sets (see Extended 
Data Table 1). 

Structure determination. The molecular replacement solution for SubNACore 
was obtained using the program Phaser. The search model consisted of a 
geometrically ideal β-strand composed of nine alanine residues. Crystallographic 
refinements were performed with the program Refmac. 

The molecular replacement solution for NACore was obtained using the program 
Phaser. The search model consisted of the SubNACore structure determined 
previously. Crystallographic refinements were performed with the programs Phenix 
and Buster. 

The molecular replacement solution for PreNAC was obtained using the program 
Phaser. The search model consisted of a geometrically ideal β-strand 
composed of six residues with sequence GVTTVA. Crystallographic refinements 
were performed with the programs Phenix and Refmac. 

Model building for all segments was performed using COOT. Data processing 
and refinement statistics are reported in Extended Data Table 1. The coordinates 
of the final models and the structure factors have been deposited in the 
Protein Data Bank with PDB code 4RIK for SubNACore, 4RIL for NACore, and 
4ZNN for PreNAC. The structures were illustrated using Pymol. 

Protein expression and purification. The human wild-type α-synuclein construct 
has been previously characterized (pRK172, ampicillin, T7 promoter) 
with sequence: MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVL 
YVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAG 
SIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDY 
EPEA. 

Full-length α-synuclein was purified according to published protocols. The 
α-synuclein construct was transformed into Escherichia coli expression cell line 
BL21 (DE3) gold (Agilent Technologies) for wild-type α-synuclein protein 
expression. A single colony was inoculated into 100 ml LB Miller broth (Fisher 
Scientific) supplemented with 100 μg ml⁻¹ ampicillin (Fisher Scientific) and grown 
overnight at 37 °C. One litre of LB (Miller) supplemented with 100 μg ml⁻¹ ampicillin 
in 2-l shaker flasks was inoculated with 10 ml of overnight culture and grown 
at 37 °C until the culture reached OD600 ≈ 0.6-0.8 as measured by a BioPhotometer 
UV/VIS Photometer (Eppendorf). IPTG (isopropyl β-D-1-thiogalactopyranoside) 
was added to a final concentration of 0.5 mM, and cultures were grown for 4-6 h at 30 °C. Cells 
were harvested by centrifugation at 5,500g for 10 min at 4 °C. The cell pellet was 
frozen and stored at −80 °C. 

The cell pellet was thawed on ice and resuspended in lysis buffer (100 mM Tris-HCl 
pH 8.0, 500 mM NaCl, 1 mM EDTA pH 8.0) and lysed by sonication. Crude 
cell lysate was clarified by centrifugation at 15,000g for 30 min at 4 °C. The 
clarified cell lysate was boiled and cell debris was removed by centrifugation. 
Protein in the supernatant was precipitated at pH 3.5 by titrating HCl into the 
protein solution on ice while stirring, and the mixture was centrifuged again at 
15,000g for 30 min at 4 °C. Supernatant was dialysed against buffer A 
(20 mM Tris-HCl, pH 8.0). After dialysis the solution was filtered through a 
0.45 μm syringe filter (Corning) before loading onto a 20 ml HiPrep Q HP 16/10 
column (GE Healthcare). The Q-HP column was washed with five column 
volumes of buffer A and protein was eluted using a linear gradient to 100% 
buffer B (20 mM Tris-HCl, 1 M NaCl, pH 8.0) over five column volumes. Protein eluted 
at around 50-70% buffer B; peak fractions were pooled. Pooled samples were 
concentrated approximately tenfold using Amicon Ultra-15 centrifugal filters. 
Approximately 5 ml of the concentrated sample was loaded onto a HiPrep 26/60 
Sephacryl S-75 HR column equilibrated with filtration buffer (25 mM sodium 
phosphate, 100 mM NaCl, pH 7.5). Peak fractions were pooled from the gel 
filtration column, dialysed against 5 mM Tris-HCl, pH 7.5, and concentrated to 
3 mg ml⁻¹. These were filtered through a 0.2 μm pore size filter (Corning) and 
stored at 4 °C. 
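
Because buffer A contains no NaCl and buffer B contains 1 M NaCl, the percentage of buffer B at elution maps directly onto a salt concentration. A minimal sketch of that linear mixing, assuming ideal mixing and the buffer compositions given above; the function name is illustrative.

```python
# NaCl concentration at a given point on a linear A-to-B gradient.
# Assumes ideal mixing; defaults reflect buffer A (no NaCl) and buffer B (1 M NaCl).

def nacl_molar(percent_b: float, nacl_a: float = 0.0, nacl_b: float = 1.0) -> float:
    """NaCl molarity when the mobile phase contains percent_b per cent buffer B."""
    fraction_b = percent_b / 100.0
    return (1.0 - fraction_b) * nacl_a + fraction_b * nacl_b

elution_start = nacl_molar(50.0)   # lower end of the reported elution window
elution_end = nacl_molar(70.0)     # upper end of the reported elution window
```

On this reading the protein elutes at roughly 0.5 to 0.7 M NaCl.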

Recombinantly expressed full-length α-synuclein with an N-terminal acetylation 
was prepared and purified as follows, based on a protocol detailed 
in ref. 16. The α-synuclein plasmid was co-expressed with a heterodimeric protein 
acetylation complex from Schizosaccharomyces pombe to acetylate the N terminus 
(pACYC-DUET, chloramphenicol, T7 promoter). The two vectors were 
co-transformed into E. coli BL21 (DE3) using media containing both ampicillin 
and chloramphenicol. Cell cultures were grown in TB media containing ampicillin 
and chloramphenicol and induced to express α-synuclein with 0.5 mM 
IPTG overnight at 25 °C. Cells were harvested by centrifugation, the cell pellet 
then resuspended in lysis buffer (100 mM Tris-HCl pH 8.0, 500 mM NaCl, 
1 mM EDTA pH 8.0, and 1 mM phenylmethylsulfonyl fluoride) and cells lysed 
using an Emulsiflex homogenizer (Avestin). The lysate was boiled and debris 
removed by centrifugation. A protein fraction was also removed by precipitation 
at low pH on ice followed by centrifugation. The remaining supernatant was pH 
adjusted by titration and dialysed against buffer A (20 mM Tris-HCl, pH 8.0, 
1 mM DTT, 1 mM EDTA, pH 8.0). The resulting protein solution was loaded 
onto a 5 ml Q-Sepharose FF column (GE Healthcare) equilibrated with buffer 
A and eluted against a linear gradient of buffer B (1 M NaCl, 20 mM Tris-HCl, pH 
8.0, 1 mM DTT, 1 mM EDTA, pH 8.0). Fractions containing α-synuclein were 
identified using SDS-PAGE, collected, concentrated and further purified by size 
exclusion (Sephacryl S-100 16/60, GE Healthcare) in 20 mM Tris, pH 8.0, 
100 mM NaCl, 1 mM DTT, 1 mM EDTA. Purity of fractions was assessed by 
SDS-PAGE. 

Acetylated protein was characterized by LC-MS. Expected average mass: 
14460.1 Da for α-synuclein and 14502.1 Da for acetylated α-synuclein. Observed 
average mass: 14464.0 Da for α-synuclein and 14506.0 Da for acetylated α-synuclein 
(Extended Data Fig. 6). The shift of 4 Da between observed and expected 
average masses is due to instrumental error. 
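
The expected average masses can be checked from the construct sequence given above. The sketch below sums standard average residue masses, adds one water for the free termini, and adds an acetyl group (C2H2O) for the modified form. The residue mass table and the acetyl increment are standard reference values supplied here as assumptions, not taken from the paper.

```python
# Average (not monoisotopic) residue masses in Da; standard reference values.
AVG_RESIDUE_MASS = {
    "G": 57.0513, "A": 71.0779, "S": 87.0773, "P": 97.1152, "V": 99.1311,
    "T": 101.1039, "C": 103.1429, "L": 113.1576, "I": 113.1576, "N": 114.1026,
    "D": 115.0874, "Q": 128.1292, "K": 128.1723, "E": 129.1140, "M": 131.1961,
    "H": 137.1393, "F": 147.1739, "R": 156.1857, "Y": 163.1733, "W": 186.2099,
}
WATER = 18.0153          # added once for the N- and C-terminal H and OH
ACETYL_SHIFT = 42.0367   # net mass added by N-terminal acetylation (C2H2O)

# Human wild-type alpha-synuclein, as listed in the Methods.
SEQUENCE = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVL"
    "YVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAG"
    "SIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDY"
    "EPEA"
)

def average_mass(sequence: str) -> float:
    """Average mass in Da of an unmodified linear peptide."""
    return sum(AVG_RESIDUE_MASS[residue] for residue in sequence) + WATER

native_mass = average_mass(SEQUENCE)
acetylated_mass = native_mass + ACETYL_SHIFT
# Relative size of the reported 4 Da offset (observed 14464.0 Da), in parts per million.
offset_ppm = (14464.0 - native_mass) / native_mass * 1e6
```

Both computed values fall within a fraction of a dalton of the expected masses quoted above, and the reported 4 Da offset corresponds to a few hundred parts per million.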

Fibril formation and detection. Purified α-synuclein in 50 mM Tris, 150 mM KCl, 
pH 7.5 was shaken at a concentration of 500 μM at 37 °C in a Torrey Pines shaker. To 
form the fibrillar samples of SubNACore and NACore, lyophilized peptides were 
dissolved to a final concentration of 500 μM in 5 mM lithium hydroxide, 20 mM 
sodium phosphate pH 7.5 and 0.1 M NaCl. All samples were shaken at 37 °C in a 
Torrey Pines shaker for 72 h. Freshly dissolved samples were prepared by dissolving 
lyophilized peptides immediately before addition to cells for assays. 

Turbidity measurements were used to compare NACore and SubNACore 
aggregation. Peptide samples were freshly dissolved to 1.6 mM in a sample buffer 
with 5 mM LiOH and 1% DMSO and then filtered through a PVDF filter 
(Millipore, 0.1 μm). Measurements were performed using a black NUNC 96-well 
plate with 200 μl of sample per well (3-4 replicates per sample). The plate was 
agitated at 37 °C, with a 3 mm rotation diameter, at 300 r.p.m. in a Varioskan 
microplate reader (Thermo). Absorbance readings were recorded every 3-15 min 
at 340 nm. 

Negative-stain transmission electron microscopy. Cytotoxicity samples were 
evaluated for presence of fibrils by electron microscopy. In brief, 5-μl samples 
were spotted directly on freshly glow-discharged carbon-coated electron microscopy 
grids (Ted Pella). After 4 min incubation, grids were rinsed twice with 5 μl 
distilled water and stained with 2% uranyl acetate for 1 min. Specimens were 
examined on an FEI T12 electron microscope. 

Fibril diffraction. Fibrils formed from purified α-synuclein with and without 
N-terminal acetylation were concentrated by centrifugation, washed, and 
oriented while drying between two glass capillaries. Likewise, NACore nanocrystals 
were also concentrated, washed with nanopure water, and allowed to orient 
while drying between two glass capillaries. The glass capillaries holding the 
aligned fibrils or nanocrystals were mounted on a brass pin for diffraction at 
room temperature using 1.54 Å X-rays produced by a Rigaku FRE+ rotating 
anode generator equipped with an HTC imaging plate. All patterns were collected 
at a distance of 180 mm and analysed using the Adxv software package. A 
simulated pattern from the full-length α-synuclein model presented in Fig. 3 
was obtained by calculating structure factors from the model using the sfall 
module from CCP4, assigning the model a unit cell of 200 × 4.74 × 200 Å. 
Cylindrical averaging of these structure factors about the fibre axis (y axis) 
direction produced a set of simulated fibril diffraction intensities. 
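
One way to picture the cylindrical averaging step is to collapse each reflection onto a radial coordinate R (distance from the fibre axis) and an axial coordinate Z (height along it), then pool the intensity that lands in the same (R, Z) bin. The sketch below is a simplified illustration under assumptions of our own: reflections are given as Cartesian reciprocal-space coordinates with the fibre axis along y, and the bin width is arbitrary. It is not the procedure or code used by the authors.

```python
import math
from collections import defaultdict

def cylindrical_average(reflections, bin_width=0.01):
    """Pool intensities by (R, Z) about a fibre axis along y.

    reflections: iterable of (x, y, z, intensity), with x, y, z in reciprocal
    angstroms. Returns a dict mapping (R, Z) bin centres to summed intensity.
    """
    pooled = defaultdict(float)
    for x, y, z, intensity in reflections:
        radial = math.hypot(x, z)   # distance from the fibre (y) axis
        axial = abs(y)              # the pattern is symmetric about the equator
        key = (round(radial / bin_width), round(axial / bin_width))
        pooled[key] += intensity
    return {(i * bin_width, j * bin_width): total for (i, j), total in pooled.items()}

# Hypothetical reflections that differ only in azimuth about the fibre axis.
toy_reflections = [
    (0.1, 0.2, 0.0, 5.0),
    (0.0, 0.2, 0.1, 3.0),
    (0.0, -0.2, -0.1, 2.0),
]
fibre_pattern = cylindrical_average(toy_reflections)
```

All three toy reflections share the same R and |Z|, so they fall into a single bin, which is the behaviour that turns a single-crystal pattern into a fibre-like one.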

Cytotoxicity assays. Adherent PC12 cells (ATCC CRL-1721) were cultured in 
ATCC-formulated RPMI-1640 medium (ATCC 30-2001) supplemented with 
10% horse serum and 5% fetal bovine serum and plated at 10,000 cells per well 
to a final volume of 90 μl. All MTT assays were performed with Cell Titer 96 
aqueous non-radioactive cell proliferation kit (MTT, Promega cat. no. 4100). 
Cells were cultured in 96-well plates (Costar cat. no. 3596) for 20 h at 37 °C in 
5% CO₂ before addition of samples. 10 μl of sample was added to each well containing 
90 μl of medium and incubated for 24 h at 37 °C in 5% CO₂. Then, 15 μl dye 
solution (Promega cat. no. 4102) was added into each well, followed by incubation 
for 4 h at 37 °C in 5% CO₂. This was followed by the addition of 100 μl 
Solubilization Solution/Stop Mix (Promega cat. no. 4101) to each well. After 12 h 
incubation at room temperature, the absorbance was measured at 570 nm. Background 
absorbance was recorded at 700 nm. The data were normalized with cells treated 
with 1% (w/v) SDS set to 0% reduction and cells treated with sample buffer set to 
100% reduction. 

Lactate dehydrogenase assays were done using CytoTox-ONE Homogeneous 
Membrane Integrity Assay (Promega, cat. no. G7890) as per manufacturer’s instructions. 
In brief, cells were plated in 96-well, black-wall, clear-bottom (Fisher cat. no. 
07-200-588) tissue culture plates at 10,000 cells per well to a final volume of 90 μl. Cells 
were incubated for an additional 20 h at 37 °C in 5% CO₂ before addition of 
samples. Next, 10 μl of sample was added to each well, following which the cells 
were incubated for another 24 h. 100 μl of reagent was added to each well and 
incubated for 15 min at room temperature. The addition of 50 μl of stop solution 
stopped the reaction. Fluorescence was measured in a Spectramax M5 (Molecular 
Devices) using excitation and emission wavelengths of 560 nm and 590 nm, 
respectively. Data were normalized using cells treated with buffer as 0% release 
and 0.1% Triton X-100 as 100% release. 
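
Both the MTT and the lactate dehydrogenase readouts are placed on a 0 to 100% scale defined by two control wells. A minimal sketch of that two-point normalization follows; the function names and all plate readings are hypothetical and serve only to show the arithmetic.

```python
def background_corrected(a570: float, a700: float) -> float:
    """MTT signal after subtracting the 700 nm background from the 570 nm reading."""
    return a570 - a700

def percent_of_range(value: float, zero_ref: float, full_ref: float) -> float:
    """Place value on a scale where zero_ref maps to 0% and full_ref to 100%."""
    return 100.0 * (value - zero_ref) / (full_ref - zero_ref)

# Hypothetical MTT readings: SDS-treated wells define 0%, buffer-treated wells 100%.
sds = background_corrected(0.15, 0.05)
buffer_only = background_corrected(0.95, 0.05)
sample = background_corrected(0.55, 0.05)
mtt_reduction = percent_of_range(sample, sds, buffer_only)

# Hypothetical LDH fluorescence: buffer-treated wells define 0%, Triton X-100 wells 100%.
ldh_release = percent_of_range(3000.0, 1000.0, 9000.0)
```

The same helper serves both assays because only the choice of reference wells differs.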

Construction of α-synuclein A53T fibril model. A model for full-length α-synuclein 
A53T mutant fibrils, which are involved in the early onset of PD, was constructed 
using a section of the NACore crystal packing as a scaffold. Figure 3 
illustrates the four copies of the NACore segment used for the scaffold. The crystal 
structure of the two inner strands was adapted with minimal changes as the 
analogous segments 68-78. The structure of PreNAC was matched onto the weak 
interface of the NACore structure. Only 4 of the 11 side chains in the segment 46-56 
differ from those in the NACore segment 68-78, and residues V51-V55 can be 
closely matched to V71-V75. Hence both the homotypic interface 
and the heterotypic interface in the full-length fibre model closely resemble those 
observed in the NACore structure. The regions outside these segments were 
adapted from the structure of the native α-synuclein fold, PDB ID: 2KKW. 
These segments were spliced in manually using COOT. The models were energy 
minimized and temperature annealed using the program CNS with 
hydrogen-bonding potential. The simulated fibre diffraction pattern calculated from this 
model shows prominent reflections that agree with those observed in fibre 
diffraction patterns of NACore, α-synuclein, and N-acetyl α-synuclein (Extended 
Data Table 2). 

Extended Data Figure 1 | A schematic representation of α-synuclein, 
highlighting the NAC region (residues 61-95) and within it the NACore 
sequence (residues 68-78). A series of bars span regions of α-synuclein that 
are of interest to this work. Among the three synuclein paralogues (α, β and γ), 
the region whose sequence is unique to α-synuclein is shown as a blue bar 
(residues 72-83) that overlaps with a large portion of NACore. Segments 
investigated in ref. 23 are also shown. These span a variety of regions within 
NACore. Two of the segments we investigate here, SubNACore and NACore, 
are shown in this context. Only one of the segments studied in ref. 23 is an exact 
match to our NACore sequence, and only this segment is both toxic and 
fibrillar. The sequences of α-synuclein, β-synuclein, and γ-synuclein are 
shown as a reference with conserved residues in bold and the NACore sequence 
in red. 

Extended Data Figure 2 | NACore difference density maps calculated after 
successful molecular replacement using the SubNACore search model 
clearly revealed the positions of the missing residues (positive Fo − Fc 
density at N and C termini corresponding to G68 and A78) and one water 
molecule near a threonine side chain (red circle); a second water was located 
during the refinement process. The blue mesh represents 2Fo − Fc density 
contoured at 1.2σ. The green and red mesh represent Fo − Fc densities 
contoured at 3.0σ and −3.0σ, respectively. All maps were σA-weighted. 

Extended Data Figure 3 | The crystal structure of NACore reveals pairs of 
sheets as in the spines of amyloid fibrils. a, NACore’s two types of sheet-sheet 
interfaces: a larger interface (orange, 268 Å² of buried accessible surface 
area per chain) we call interface A, and a weaker interface (blue, 167 Å²) we call 
interface B. The crystal is viewed along the hydrogen-bonding direction 
(crystal ‘b’ dimension). The red lines outline the unit cell. b, The van der Waals 
packing between sheets. The sheets are related by a 2₁ screw axis denoted in 
black. The only gaps left by the interface are filled with water molecules, which 
hydrogen-bond to the threonine residues (partially showing aqua spheres). 
The shape complementarity of both interfaces is 0.7. The viewing direction 
is the same as in a. c, Orthogonal view of the fibrillar assembly. The protofibril 
axis, coinciding with the 2₁ screw axis designated by the arrow, runs vertically 
between the pairs of sheets. 


©2015 Macmillan Publishers Limited. All rights reserved 




a  NACore and SubNACore (interfaces A and B)

Residue         1      2      3      4      5      6      7      8      9      All
Amino acid      A      V      V      T      G      V      T      A      V      -
RMSD_res (Å)    0.183  0.171  0.118  0.103  0.185  0.225  0.160  0.351  1.316  0.499
RMSD_ca (Å)     0.174  0.050  0.083  0.071  0.183  0.105  0.156  0.201  0.555  0.226

b  PreNAC and NACore

Residue         1      2      3      4      5      All
Amino acid      G      V      T      T/A    V      -
RMSD_ca (Å)     0.919  0.214  0.503  0.319  3.199  1.515

Extended Data Figure 4 | Comparison of the crystal packing for NACore 
and SubNACore. a, The face-to-face interactions are virtually the same for the 
pairs of NACore segments (orange chains) in its crystal structure and the 
SubNACore segments (white chains) in its structure (interfaces A and B shown 
in gold and blue, respectively). The table below shows the pairwise r.m.s.d. 
values comparing the nine residues shared in common between the structures. 
RMSD_res is an all-atom comparison between residue pairs, while RMSD_ca 
compares only Cα pairs. b, PreNAC (blue) is compared with NACore (orange). 
Five residues from each strand are shown in darker colour and the r.m.s.d. 
values between their Cα pairs are compared in the table below. The PreNAC-NACore 
interaction mimics the weaker interface B in the NACore structure. 
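
The "All" entries for the Cα comparisons in Extended Data Fig. 4 can be recovered from the per-residue values, because an overall r.m.s.d. over equally weighted atom pairs is the root of the mean of the squared per-pair deviations. A short check, with the numbers copied from the figure tables; the function name is an illustrative assumption.

```python
import math

def combine_rmsd(per_pair_deviations):
    """Overall r.m.s.d. from deviations of equally weighted atom pairs."""
    squared = [d * d for d in per_pair_deviations]
    return math.sqrt(sum(squared) / len(squared))

# Per-residue C-alpha deviations (angstroms) from Extended Data Fig. 4.
nacore_vs_subnacore = [0.174, 0.050, 0.083, 0.071, 0.183, 0.105, 0.156, 0.201, 0.555]
prenac_vs_nacore = [0.919, 0.214, 0.503, 0.319, 3.199]

overall_a = combine_rmsd(nacore_vs_subnacore)
overall_b = combine_rmsd(prenac_vs_nacore)
```

The same shortcut does not apply directly to the all-atom RMSD_res row, since residues contribute different numbers of atoms.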

Extended Data Figure 5 | Intense reflections common among the NACore 
and the two polymorphs of full-length α-synuclein suggest common 
structural features. Common structural features are illustrated here on the 
crystal packing diagrams of NACore. The (0,0,2) planes approximate the 
separation between sheets in interface A (orange). The (0,2,0), (−1,1,1), and 
(1,1,1) reflections are intense because the corresponding Bragg planes 
recapitulate the staggering of strands from opposing sheets. The red lines 
correspond to the unit cell boundaries and all planes are shown in black. 
The location of the unit cell origin is indicated by ‘O’. The unit cell dimensions 
a, b, and c are labelled. Bragg spacings (spacings between planes), indicated by 
‘d’, are given in angstroms. 
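
The Bragg spacings of these planes follow from the NACore unit cell reported in Extended Data Table 1 (a = 70.8, b = 4.82, c = 16.79 Å, β = 105.7°). The sketch below applies the standard d-spacing formula for a monoclinic cell with unique axis b; the function name is an illustrative assumption.

```python
import math

def d_spacing_monoclinic(h, k, l, a, b, c, beta_deg):
    """Bragg spacing (angstroms) of plane (h, k, l) in a monoclinic cell, unique axis b."""
    beta = math.radians(beta_deg)
    inv_d_squared = (
        (h * h / (a * a) + l * l / (c * c) - 2.0 * h * l * math.cos(beta) / (a * c))
        / math.sin(beta) ** 2
        + k * k / (b * b)
    )
    return 1.0 / math.sqrt(inv_d_squared)

NACORE_CELL = dict(a=70.8, b=4.82, c=16.79, beta_deg=105.7)

d_002 = d_spacing_monoclinic(0, 0, 2, **NACORE_CELL)
d_111 = d_spacing_monoclinic(1, 1, 1, **NACORE_CELL)
d_m111 = d_spacing_monoclinic(-1, 1, 1, **NACORE_CELL)
d_020 = d_spacing_monoclinic(0, 2, 0, **NACORE_CELL)
```

The computed spacings come out near 8.1, 4.6, 4.6 and 2.4 Å, close to the 8.2, 4.6 and 2.4 Å peaks seen in the fibril diffraction patterns.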

Extended Data Figure 6 | Mass spectrometry analysis of recombinantly 
expressed, full-length α-synuclein, with and without N-terminal acetylation. 
The mass profile of wild-type full-length α-synuclein (left) is compared to 
that of an N-terminally acetylated form of the protein (right). The mass shift for 
the N-terminally acetylated form is appropriately shifted with respect to the 
native form of the protein (14464.0 Da for α-synuclein and 14506.0 Da for 
acetylated α-synuclein), within a margin of error of 4 Da. 

Extended Data Table 1 | Statistics of data collection and atomic refinement for NACore, its fragment SubNACore, and PreNAC 

Segment               SubNACore            NACore               PreNAC
                      69AVVTGVTAV77        68GAVVTGVTAVA78      47GVVHGVTTVA56
Data collection
Radiation source      Synchrotron          Electron             Electron
Space group           C2                   C2                   P2₁
Cell dimensions
  a, b, c (Å)         61.9, 4.80, 17.3     70.8, 4.82, 16.79    17.9, 4.7, 33.0
  α, β, γ (°)         90, 104.1, 90        90, 105.7, 90        90, 94.3, 90
Resolution (Å)        1.85 (1.95-1.85)     1.43 (1.60-1.43)     1.41 (1.56-1.41)
Wavelength (Å)        0.9791               0.0251               0.0251
Rmerge                0.117 (0.282)        0.173 (0.560)        0.236 (0.535)
Rr.i.m.               0.135 (0.322)        0.199 (0.647)        0.264 (0.609)
Rp.i.m.               0.065 (0.154)        0.093 (0.311)        0.185 (0.305)
I/σI                  5.2 (2.7)            5.5 (2.5)            4.6 (1.8)
CC1/2 (%)             99.5 (97.8)          99.4 (92.3)          96.7 (74.0)
Completeness (%)      97.9 (98.3)          89.9 (82.6)          86.9 (69.6)
Multiplicity          4.1 (4.0)            4.4 (4.3)            3.7 (3.5)
Refinement
Resolution (Å)        1.85 (2.07-1.85)     1.43 (1.60-1.43)     1.41 (1.41-1.57)
No. reflections       470 (125)            1073 (245)           1006 (239)
Rwork                 0.176 (0.248)        0.248 (0.253)        0.235 (0.336)
Rfree                 0.221 (0.286)        0.275 (0.331)        0.282 (0.329)
CCwork                0.964 (0.896)        0.947 (0.618)        0.937 (0.335)
CCfree                0.889 (0.993)        0.986 (0.269)        0.967 (0.361)
No. atoms
  Protein             57                   66                   66
  Water               3                    2                    4
B-factors (Å²)
  Protein             17.1                 9.0                  16.1
  Water               27.6                 2.7                  24.6
Wilson B (Å²)         11.8                 10.3                 13.8
R.m.s. deviations
  Bond lengths (Å)    0.005                0.010                0.020
  Bond angles (°)     1.1                  1.6                  2.0
PDB ID code           4RIK                 4RIL                 4ZNN
EMDB ID code                               EMD-3028             EMD-3001

*Highest resolution shell is shown in parentheses. Data quality is indicated 
by the redundancy-independent merging R factor (r.i.m.) and the precision-indicating merging R factor (p.i.m.). 
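
The three merging statistics in Extended Data Table 1 differ only in how each reflection's deviations are weighted by its multiplicity N: Rmerge uses no weight, Rr.i.m. uses sqrt(N/(N−1)) and Rp.i.m. uses sqrt(1/(N−1)). The sketch below computes all three from repeated intensity measurements using these standard definitions. The toy intensities are hypothetical, and this is not the XSCALE or Scalepack implementation.

```python
import math

def merging_r_factors(observations):
    """Return (Rmerge, Rrim, Rpim) from a dict mapping hkl to a list of intensities."""
    sum_merge = sum_rim = sum_pim = sum_intensity = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            continue  # singly measured reflections carry no merging information
        mean = sum(intensities) / n
        deviation = sum(abs(i - mean) for i in intensities)
        sum_merge += deviation
        sum_rim += math.sqrt(n / (n - 1)) * deviation
        sum_pim += math.sqrt(1.0 / (n - 1)) * deviation
        sum_intensity += sum(intensities)
    return (sum_merge / sum_intensity,
            sum_rim / sum_intensity,
            sum_pim / sum_intensity)

# Hypothetical data: two reflections, each measured four times.
toy_observations = {
    (1, 0, 0): [100.0, 110.0, 90.0, 100.0],
    (0, 1, 0): [50.0, 55.0, 45.0, 50.0],
}
r_merge, r_rim, r_pim = merging_r_factors(toy_observations)
```

When every reflection has the same multiplicity, Rr.i.m. and Rp.i.m. are simple multiples of Rmerge, which is roughly the pattern seen in the table for multiplicities near 4.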

Extended Data Table 2 | Comparison of reflections observed in powder diffraction of fibrils of full-length α-synuclein, N-acetyl α-synuclein, 
and a synthetic pattern calculated from our α-synuclein model, to aligned nanocrystals of NACore. 

Segment             Reflections (Å)

NACore              2.21, 2.26, 2.39, 2.52, 2.61, 2.68, 2.78, 3.02, 3.12, 3.34, 3.56, 3.86,
(GAVVTGVTAVA)       4.34, 4.57, 5.16, 5.98, 7.56, 8.19, 10.46, 11.63, 13.29, 16.61

α-syn               2.39, 4.64, 6.82, 8.29, 10.06

N-acetyl α-syn      2.38, 4.62, 8.18, 9.80, 11.90

Simulated α-syn     2.23, 2.25, 2.35, 3.29, 3.63, 3.70, 3.95, 4.08, 4.56, 4.68, 8.36, 8.69,
                    21.76, 24.47, 27.61, 31.67

Bold reflections are strong and common to all three samples. Colours of the labelled reflections match those in Fig. 2. 

Structure of the toxic core of α-synuclein from invisible crystals 

Jose A. Rodriguez, Magdalena I. Ivanova, Michael R. Sawaya, Duilio Cascio, Francis E. Reyes, Dan Shi, 
Smriti Sangwan, Elizabeth L. Guenther, Lisa M. Johnson, Meng Zhang, Lin Jiang, Mark A. Arbing, Brent L. Nannenga, 
Johan Hattne, Julian Whitelegge, Aaron S. Brewster, Marc Messerschmidt, Sébastien Boutet, Nicholas K. Sauter, 
Tamir Gonen & David S. Eisenberg 

The protein α-synuclein is the main component of Lewy bodies, the neuron-associated aggregates seen in Parkinson 
disease and other neurodegenerative pathologies. An 11-residue segment, which we term NACore, appears to be 
responsible for amyloid formation and cytotoxicity of human α-synuclein. Here we describe crystals of NACore that 
have dimensions smaller than the wavelength of visible light and thus are invisible by optical microscopy. As the crystals 
are thousands of times too small for structure determination by synchrotron X-ray diffraction, we use micro-electron 
diffraction to determine the structure at atomic resolution. The 1.4 Å resolution structure demonstrates that this method 
can determine previously unknown protein structures and here yields, to our knowledge, the highest resolution 
achieved by any cryo-electron microscopy method to date. The structure exhibits protofibrils built of pairs of 
face-to-face β-sheets. X-ray fibre diffraction patterns show the similarity of NACore to toxic fibrils of full-length 
α-synuclein. The NACore structure, together with that of a second segment, inspires a model for most of the ordered 
portion of the toxic, full-length α-synuclein fibril, presenting opportunities for the design of inhibitors of α-synuclein 
fibrils. 


The presynaptic protein α-synuclein, found in both soluble and membrane-associated 
fractions of the brain, aggregates in Parkinson disease (PD). 
These aggregates are the main component of Lewy bodies, 
the defining histological feature of this neurodegenerative disease, and 
have been shown to accompany neuronal damage. Two other observations 
point to aggregated α-synuclein as a molecular cause of PD. 
The first is that families with inherited forms of PD carry mutations in 
α-synuclein, such as A53T, and abundant Lewy bodies. The second 
is that families with duplicated or triplicated genes encoding α-synuclein 
develop early-onset PD, presumably because at high local concentrations 
α-synuclein is forced into amyloid. 

Our focus is on a central segment of α-synuclein, residues 68-78, 
that we term NACore (Fig. 1), because of its critical role in both the 
aggregation and cytotoxicity of α-synuclein. NACore lies within a 35-residue 
domain of α-synuclein termed NAC (non-amyloid-β component, 
originally reported to be deposited with amyloid-β in the 
brains of Alzheimer’s disease patients), which has been established 
as necessary and sufficient for the aggregation and toxicity of 
α-synuclein (Extended Data Fig. 1). For example, deletion of residues 
71-82 prevents aggregation of α-synuclein in vitro, and abolishes 
both its aggregation and neurotoxicity in a Drosophila model of PD. 
Yet this segment in isolation from the rest of α-synuclein readily 
forms amyloid fibrils and is highly cytotoxic. Also, β-synuclein, 
the close homologue of α-synuclein, which does not aggregate and is 
not found in Lewy bodies, differs in sequence from α-synuclein principally 
by the lack of residues 74-84 that are part of NACore. 

Segments outside NAC also influence the aggregation of α-synuclein 
and have been associated with fibril structure. In brain extracts from 
patients with multiple system atrophy, the core of α-synuclein fibrils 
extends approximately from residue 30 to 100. Also the A53T mutation 
of α-synuclein can accelerate its transition into the amyloid state, and 
hence accelerate PD. This mutation was found to induce the onset of 
PD at an early age, and consistent with this, α-synuclein containing this 
A53T mutation forms fibrils in vitro more rapidly than the wild type. 
Thus we carried out screens for crystals of peptide segments within the 
NAC domain and adjacent regions, seeking structural information on 
the molecular basis of aggregation and toxicity of α-synuclein. 


Structure determination by MicroED 
Extensive crystal screens of two segments, NACore, residues 
68GAVVTGVTAVA78, and PreNAC, 47GVVHGVTTVA56, seemingly 
produced non-crystalline, amorphous aggregates. But upon examination 
using electron microscopy, we found the aggregates to be clusters 
of elongated nanocrystals only 50-300 nm in cross section and thus 
invisible by conventional light microscopy (Fig. 1). We confirmed well-ordered 
crystallinity of NACore at both the SACLA and LCLS free 
electron lasers. We also found that a nine-residue fragment within 
the NACore, which we term SubNACore, 69AVVTGVTAV77, yielded 
crystals 1,000-10,000 times larger in volume than the NACore nanocrystals 
(Fig. 1). We were therefore able to apply synchrotron methods 
to these larger crystals to determine the structure of their 
amyloid-like fibrils. Although this nine-residue fragment is missing 
only two residues compared with NACore, it is not as toxic, offering 
some insight into the toxicity of α-synuclein, as described below. 

To determine the structure of the invisible crystals of NACore and 
PreNAC, we turned to micro-electron diffraction (MicroED). In 

Figure 1 | NACore (residues 68-78) is the fibril-forming core of the NAC 
domain of full-length α-synuclein. a, Top, segments identified as β-strands 
by electron paramagnetic resonance (EPR), solid state nuclear magnetic 
resonance (ssNMR), hydrogen-deuterium exchange (HD), and site directed 
spin labelling (SDSL). Bottom, predictions of the propensity of six-residue 
segments to form amyloid fibrils. The vertical axis indicates the propensity 
of steric zipper formation in Rosetta energy units. Zipper-forming segments 
are predicted where bars cross the −23 kcal mol⁻¹ threshold marked by the 
dashed line. Blue-to-red color gradient indicates weak-to-strong propensity of 
steric zipper formation. The A53T early-onset Parkinson mutation is indicated 
by a red arrow and red letter T. b, The minuscule size of the PreNAC and 
NACore crystals used for MicroED is illustrated by this comparison to 
SubNACore microcrystals. Scale comparisons are illustrated on two 
magnifications using phase contrast light microscope images and electron 
micrographs, in which individual NACore and PreNAC nanocrystals are 
indistinguishable by light microscopy. 


MicroED, an extremely low-dose electron beam is directed onto a 
nanocrystal within a transmission electron microscope under cryogenic 
conditions, yielding diffraction patterns such as those in Fig. 2. 
As the wavelength used in our experiments at 200 keV is very small 
(0.025 Å), the Ewald sphere is essentially flat, resulting in diffraction 
patterns that closely resemble a 2D slice through 3D reciprocal space. 
As the crystal is continuously rotated in the beam, a series of such 
diffraction patterns is collected. Scaling together diffraction data 
collected from multiple crystals produces a full 3D diffraction data 
set. MicroED has been successfully applied to the well-known 
structures of hen egg-white lysozyme, bovine liver catalase and 
Ca²⁺-ATPase. But NACore and PreNAC are the first previously unknown 
structures determined by MicroED. 
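The quoted wavelength and the flatness of the Ewald sphere can be checked with a short calculation. The sketch below is illustrative only: it assumes the standard relativistic de Broglie relation and CODATA constants, and the function names are hypothetical rather than taken from any software used in this work.

```python
import math

# CODATA constants in SI units (assumed here; not quoted in the text)
PLANCK_H = 6.62607015e-34            # J s
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
SPEED_OF_LIGHT = 299792458.0         # m / s


def electron_wavelength_angstrom(accelerating_voltage_v):
    """Relativistic de Broglie wavelength (Å) of an electron accelerated through a voltage."""
    kinetic_energy = ELEMENTARY_CHARGE * accelerating_voltage_v  # J
    rest_energy = ELECTRON_MASS * SPEED_OF_LIGHT ** 2            # J
    momentum = math.sqrt(
        2.0 * ELECTRON_MASS * kinetic_energy
        * (1.0 + kinetic_energy / (2.0 * rest_energy))
    )
    return PLANCK_H / momentum * 1e10  # metres to Å


def ewald_sphere_offset(wavelength_a, d_spacing_a):
    """Departure (Å^-1) of the Ewald sphere from its tangent plane at resolution d."""
    radius = 1.0 / wavelength_a   # sphere radius in reciprocal space
    s = 1.0 / d_spacing_a         # in-plane distance from the origin
    return radius - math.sqrt(radius ** 2 - s ** 2)


wavelength = electron_wavelength_angstrom(200e3)
print(f"wavelength at 200 kV: {wavelength:.4f} Å")
print(f"Ewald sphere radius: {1.0 / wavelength:.1f} Å^-1")
print(f"offset from flat at 1.4 Å: {ewald_sphere_offset(wavelength, 1.4):.4f} Å^-1")
```

For comparison, calling `ewald_sphere_offset(1.0, 1.4)` for a 1 Å X-ray gives a far larger offset, which is why an electron diffraction frame approximates a planar section of reciprocal space.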

For NACore and PreNAC, we collected MicroED patterns from 
nanocrystals that lay preferentially oriented, flat on the surface of a 
holey carbon Quantifoil grid, in a frozen-hydrated state. Grids were 
first screened for appropriately sized crystals, and candidate crystals 
screened for diffraction. We used crystals showing strong diffraction 
for data collection by continuous unidirectional rotation about a fixed 
Figure 2 | Diffraction from NACore nanocrystals is similar to that from 
full-length α-synuclein fibrils. a, Single-crystal electron diffraction pattern 
obtained during MicroED data collection (see text). Equally spaced concentric 
rings denote resolution shells. The highest resolution spot is at 1.52 Å (arrow). 
The inset shows the over-focused image of the diffracting crystal (arrow), 
which is ~1,480 × 200 × 200 nm. Orientations of the reciprocal cell axes are 
indicated by the arrows labelled a*, b*, c*. b, Composite of fibril diffraction 
patterns from α-synuclein (α-syn) preparations and NACore. Full-length 
α-synuclein reveals reflections that match those from NACore and 
N-terminally acetylated α-synuclein. The two patterns of full-length 
α-synuclein share with NACore three major peaks denoted by arrows: 8.2 Å 
(orange), 4.6 Å (blue), and 2.4 Å (green). The origin of these peaks can be traced 
to the (0,0,2), [(1,1,1),(−1,1,1)], and (0,2,0) planes in the NACore structure, 
respectively. We attribute the strong 8.2 Å reflection to the spacing between 
adjacent pairs of β-sheets. 


axis, acquiring a series of diffraction frames at fixed time intervals. 
The needle-shaped crystals typically exceeded the length needed for 
MicroED; those that were unbent and 100 to 300 nm wide produced 
the best diffraction patterns. Data from multiple crystals were 
integrated, scaled and merged together (Extended Data Table 1). 

The multi-crystal NACore and PreNAC data sets were phased by 
molecular replacement, using the atomic model of SubNACore and 
an ideal β-strand model, respectively, as probes. Residues of NACore 
that were missing from the SubNACore probe were clearly revealed 
in a difference density map calculated from NACore observed 
structure factor amplitudes and phases from the SubNACore probe 
structure (Extended Data Fig. 2). After subsequent refinement, two water 
Figure 3 | Structure of the amyloid core of α-synuclein. a, The crystal 
structure of NACore (orange) reveals pairs of sheets as in the spines of amyloid 
fibrils. The A53T mutation (black) is shown in PreNAC (blue). The sheets in 
both structures are related by the 2₁ fibril axes shown in black. The gaps left by 
the interface are filled with water molecules which hydrogen-bond to the 
threonine residues (partially showing aqua spheres). b, c, Orthogonal views of 
the fibrillar assemblies. d, A speculative model of an α-synuclein protofibril 
containing the A53T mutation (black), where the strong interface of NACore 
(orange) forms the core of the fibril and its weaker interface interacts with 
PreNAC (blue). e, The locations of 5 out of a possible 73 protons are suggested 
by small, positive Fo − Fc density (green, contoured at 2.8σ, shown by arrows). 
The blue mesh is 2Fo − Fc density contoured at 1.4σ. 


molecules, and several hydrogen atoms, were observed (Fig. 3e). Full 
models of NACore and PreNAC were refined against the MicroED 
data, producing structures at 1.4 Å resolution with acceptable R 
factors (Extended Data Table 1). Electron scattering factors were used in 
the refinement calculations. 
NACore structure 


The structure of the NACore peptide chain is a nearly fully extended 
β-strand (Fig. 3 and Extended Data Fig. 3). These NACore strands stack 
in-register into β-sheets, as had been predicted by site-directed spin 
labelling. The sheets are paired (Fig. 3b), as is usual in amyloid spines, 
and the pairs of sheets form typical steric-zipper protofilaments (Fig. 3c), 
previously seen as the spines in many amyloid-like fibrils formed from 
short segments of fibril-forming proteins. The unusual features of this 
steric zipper are that the zipper, at 11 residues, is wider than has 
been previously observed, and each pair of sheets contains two water 
molecules, each associated with a threonine side chain within the 
interface. Most steric zippers are completely dry. Also, in our crystals of 
NACore, each sheet forms two snug interfaces: interface A, with 
268 Å² of buried accessible surface area per chain, is more extensive 
and presumably stronger than interface B (167 Å²), because the terminal 
residues of the chains in opposing sheets bend towards each other (Fig. 3 
and Extended Data Fig. 4). The structure of PreNAC reveals a peptide 
chain that forms a β-strand kinked at Gly51. These strands are arranged 
into pairs of β-sheets that, like the NACore structure, interdigitate to 
form steric zipper protofilaments (Fig. 3). Of special note, a five-residue 
segment of PreNAC (₅₁GVTTV₅₅) differs in only one residue from a 
five-residue segment of NACore (₇₃GVTAV₇₇), and their α-carbons 
superimpose closely with a root mean square deviation (r.m.s.d.) of 1.5 Å 
(Extended Data Fig. 4). This means that the weaker interface B of 
NACore mimics a hypothetical interface between NACore and 
PreNAC (Fig. 3d). 
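As a consistency check on the 1.5 Å figure, the overall r.m.s.d. can be recomputed from the five per-residue Cα deviations tabulated in Extended Data Fig. 4b. This is a minimal sketch: it assumes the tabulated values are distances measured after superposition, and the function name is illustrative.

```python
import math

# Per-residue Cα deviations (Å) between PreNAC 51-55 and NACore 73-77,
# copied from the table in Extended Data Fig. 4b.
CA_DEVIATIONS = [0.919, 0.214, 0.503, 0.319, 3.199]


def rmsd(deviations):
    """Root mean square of a list of per-atom deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


overall = rmsd(CA_DEVIATIONS)
print(f"overall Cα r.m.s.d.: {overall:.3f} Å")
```

The result agrees with the 1.515 Å listed in the "All" column, and the calculation shows that the fifth residue dominates the overall deviation.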


Relevance of NACore to Parkinson disease 


The relationship of the structure of NACore to fibrils of full-length 
α-synuclein is established by the resemblance of their diffraction 
patterns. Specifically, the fibre diffraction patterns of aligned fibrils of 
full-length and N-terminally acetylated α-synuclein protein display the 
same principal peaks as the diffraction of aligned NACore nanocrystals 
(Fig. 2). All three samples display the strong reflection at 2.4 Å in 
their diffraction patterns. As seen in Fig. 3 and Extended Data Fig. 5, 
this reflection arises in NACore because one β-sheet of the steric 
zipper is translated along the fibre axis with respect to the other 
β-sheet by 2.4 Å, one half the 4.8 Å spacing between β-strands, 
permitting the two sheets to interdigitate tightly. All three share 
a strong 4.6 Å reflection, which in NACore results from both the 
stacking of β-strands and the staggering between adjacent β-sheets 
of the steric zipper, while a shared reflection near 8.2 Å probably 
arises from the distance between the adjacent pairs of β-sheets that 
make up the α-synuclein fibril (Fig. 2 and Extended Data Fig. 5). This 
comparison of fibre diffraction patterns (Extended Data Table 2) 
strongly suggests that the structure of NACore is similar to the spine 
of the toxic fibrils of full-length α-synuclein. 

The combined structures of NACore and PreNAC allow us to 
construct a speculative model for much of the ordered segments of 
the A53T early-onset mutant α-synuclein (Fig. 3d). Experimental 
support of this model comes from the agreement of its simulated fibre 
diffraction with the measured diffraction patterns of α-synuclein and 
N-acetyl α-synuclein fibrils, as well as aligned NACore nanocrystals 
(Extended Data Table 2). Above we hypothesized that the weaker 
interface B of NACore might mimic an intramolecular interaction 
of PreNAC with NACore (Fig. 3). In fact, the interacting side chains 
in the weaker NACore interface B (G73, T75 and V77) are identical to 
the side chains (G51, T53, V55) interacting in the hypothetical inter- 
face of PreNAC with NACore. Assuming that this interface actually 
forms in fibrils of the early-onset mutant A53T, we built the model 
shown in Fig. 3d. The hypothetical interface of this model offers 
a possible reason for a greater propensity of the A53T mutant to 
aggregate than the wild-type sequence, conceivably leading to the 
early onset of PD. 

The identity and structure of the cytotoxic amyloid formed by 
α-synuclein remains a subject of intensive research. The weight 
Figure 4 | NACore aggregates faster than SubNACore and is more cytotoxic 
to cultured cells. a, b, Cytotoxicity of NACore, SubNACore and α-synuclein 
measured on PC12 cells using a 
3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide (MTT) assay (a) 
and a lactate dehydrogenase (LDH)-release assay (b). In both assays NACore 
is more toxic than SubNACore. Also, shaken fibrils are more toxic than an 
equal concentration of freshly dissolved sample. Results shown as mean ± s.e.m. 
based on triplicate samples. A t-test was used to measure statistical 
significance; *P < 0.05, ***P < 0.001. 
c, Representative electron micrographs of NACore, SubNACore 


of evidence over the past decade has tilted scientific opinion from the 
fully developed amyloid fibrils found in Lewy bodies as the toxic 
entities to smaller, transient amyloid oligomers. Yet recently, 
quantitative arguments have been put forward in favour of fibrils. Our 
experiments on the cytotoxicity of NACore on PC12 cells (Fig. 4) are 
consistent with the view that fibrils are toxic: we find that NACore 
shaken and aggregated for 72 h displays abundant fibrils, is more toxic 
than freshly dissolved NACore (Fig. 4), and is comparably toxic to 
similarly aggregated full-length α-synuclein. We also find greater 
cytotoxicity of NACore than SubNACore, which is shorter by two residues. 
This is consistent with the more rapid fibril formation of NACore than of 
SubNACore (Fig. 4d). These observations do not rule out the formation 
of a non-fibrillar, oligomeric assembly, present but undetected 
in our aggregated samples of NACore and α-synuclein. Of course, 
NACore is merely a fragment of full-length α-synuclein, and lacks 
most of the membrane-binding motifs of the N terminus of the protein, 
which have been implicated in membrane disruption. Yet it is 
clear that NACore is the minimum entity that recapitulates all the 
features of full-length α-synuclein aggregation and toxicity. 


MicroED diffraction of invisible crystals 

The minuscule size of NACore crystals is typical of amyloid and also of 
various other biological crystals of interest. For amyloid crystals, our 
speculation is that the tiny size is a consequence of the natural twist of 
β-sheets that form the protofilaments of the fibrils. The crystal lattice 
restrains the twist, creating a strain in these crystals, which increases 
as crystals grow. Eventually this strain prevents further addition of 
β-strands, limiting the thickness of the needle crystals. In our 
experience, longer segments (for example, 11 residues compared to 9 
residues) limit crystal growth even more; in the case of 11-residue 
NACore and 10-residue PreNAC, the strain produces nanocrystals, 
invisible by optical microscopy. These crystals are too small for 
mounting and conventional synchrotron data collection, but are 
ideally suited for analysis by MicroED. They are ~10¹⁰ times smaller 
and α-synuclein (α-syn) samples tested for cytotoxicity. NACore and 
α-synuclein show abundant fibrils but SubNACore shows few. NACore also 
forms fibrils immediately upon dissolving, whereas SubNACore shows no 
fibres, but instead amorphous aggregates. Scale bar, 500 nm. d, NACore and 
SubNACore were aggregated in identical conditions and monitored by 
turbidity. NACore begins to aggregate after 15 h while SubNACore forms no 
aggregates for up to 50 h. Electron microscopy of the samples at 50 h confirmed 
the turbidity readings (insets; scale bars, 2 µm), with error bars denoting 
standard deviation based on triplicate samples. 


than Perutz’s haemoglobin crystals and ~10¹² times smaller than von 
Laue’s CuSO₄ crystal, which yielded the first X-ray diffraction pattern. 
Our structures of NACore and PreNAC demonstrate that MicroED is 
capable of determining new and accurate structures of biological 
material at atomic resolutions. This finding paves the way for applica- 
tions of MicroED to other biological substances of importance for 
which only nanocrystals can be grown. In our particular application, 
we have been able to learn the atomic arrangement of the core of the 
crucial NAC domain. This presents opportunities for structure-based 
design of inhibitors of amyloid formation of α-synuclein. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


Data reporting. No statistical methods were used to predetermine sample size. 
The experiments were not randomized. The investigators were not blinded to 
allocation during experiments and outcome assessment. 

Crystallization. Microcrystals of SubNACore (₆₉AVVTGVTAV₇₇) were grown 
from synthetic peptide purchased from CS Bio. Crystals were grown at room 
temperature by hanging drop vaporization. Lyophilized peptide was dissolved in 
water at 2.9 mg ml⁻¹ concentration in 48 mM lithium hydroxide. Peptide was 
mixed in a 2:1 ratio with reservoir containing 0.9 M ammonium phosphate and 
0.1 M sodium acetate pH 4.6. 

Nanocrystals of NACore, ₆₈GAVVTGVTAVA₇₈, were grown from synthetic 
peptide purchased from CS Bio. Ten batches of synthesized peptide (CS Bio) at 
a concentration of 1 mg ml⁻¹ in sterile water were shaken at 37 °C on a Torrey 
Pines orbital mixing plate at speed setting 9, overnight. The insoluble material was 
washed in 30% (w/v) glycerol then stored in water at room temperature before 
diffraction. The sample contained a mixture of fibrils and crystals. 

Nanocrystals of PreNAC (₄₇GVVHGVTTVA₅₆) were grown from synthetic 
peptide purchased from InnoPep. Crystallization trials of synthesized peptide 
were prepared in batch. Peptide was weighed and dissolved in sterile-filtered 
50 mM phosphate buffer pH 7.0 with 0.1% DMSO at a concentration of 
5 mg ml⁻¹. This solution was shaken at 37 °C on a Torrey Pines orbital mixing 
plate at speed setting 9, overnight. 
Data collection and processing. X-ray diffraction data from microcrystals of 
SubNACore were collected using synchrotron radiation at the Advanced 
Photon Source, Northeast Collaborative Access Team micro focus beam line 
24-ID-E. The beam line was equipped with an ADSC Quantum 315 CCD 
detector. Data from a single crystal were collected in 5° wedges at a wavelength 
of 0.9791 Å using a 5 µm beam diameter. We used data from three different 
sections along the needle axis. The crystals were cryo-cooled (100 K) for data 
collection. Data were processed and reduced using Denzo/Scalepack from the 
HKL suite of programs. 

Electron diffraction data from nanocrystals of NACore and PreNAC were 
collected using MicroED techniques. These nanocrystals typically clump 
together. To break up the clumps, an approximately 100 µl volume of 
nanocrystals was placed in a sonication bath for 30 min. Nanocrystals were 
deposited onto a Quantifoil holey-carbon EM grid in a 2-3 µl drop after 
appropriate dilution, which was optimized for crystal density on the grid. All 
grids were then blotted and vitrified by plunging into liquid ethane using a 
Vitrobot Mark IV (FEI), then transferred to liquid nitrogen for storage. Frozen 
hydrated grids were transferred to a cryo-TEM using a Gatan 626 cryo-holder. 
Diffraction patterns and crystal images were collected using an FEG-equipped 
FEI Tecnai F20 TEM operating at 200 kV and recorded using a bottom mount 
TVIPS F416 CMOS camera with a sensor size of 4,096 × 4,096 pixels, each 
15.6 × 15.6 µm. Diffraction patterns were recorded by operating the detector in 
rolling shutter mode with 2 × 2 pixel binning, producing a final image 
2,048 × 2,048 pixels in size. Individual image frames were taken with exposure 
times of 3-4 s per image, using a selected area aperture with an illuminating 
spot size of approximately 1 µm. This geometry equates to an electron dose of 
less than 0.1 e⁻ per Å² per second. During each exposure, crystals were 
continuously rotated within the beam at a rate of 0.3° per second, 
corresponding to a 1.2° wedge per frame. Diffraction data were collected 
from several crystals each oriented differently with respect to the rotation axis. 
These data sets each spanned wedges of reciprocal space ranging from 40° to 80°. 
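The data-collection arithmetic above can be made explicit. The following sketch assumes that the exposure time is in seconds, that a 4 s exposure corresponds to the stated 1.2° wedge, and that detector readout time is negligible; all names are illustrative and the dose values are upper bounds derived from the stated limit of 0.1 e⁻ per Å² per second.

```python
import math

# Values quoted in the data-collection paragraph above.
ROTATION_RATE_DEG_PER_S = 0.3   # continuous rotation rate
EXPOSURE_S = 4.0                # assumed: upper end of the 3-4 s exposure range
MAX_DOSE_RATE = 0.1             # e- per Å^2 per s (stated upper bound)
SENSOR_PIXELS = 4096            # pixels along one edge of the sensor
PIXEL_SIZE_UM = 15.6            # physical pixel size before binning
BINNING = 2                     # 2 x 2 pixel binning


def wedge_per_frame(rate_deg_per_s, exposure_s):
    """Angular range swept during one exposure."""
    return rate_deg_per_s * exposure_s


def frames_for_wedge(total_wedge_deg, rate_deg_per_s, exposure_s):
    """Number of frames needed to cover a wedge, ignoring readout dead time."""
    return math.ceil(total_wedge_deg / wedge_per_frame(rate_deg_per_s, exposure_s))


def max_dose(total_wedge_deg, rate_deg_per_s, dose_rate):
    """Upper bound on accumulated dose (e- per Å^2) over a wedge."""
    return dose_rate * total_wedge_deg / rate_deg_per_s


binned_width = SENSOR_PIXELS // BINNING
binned_pixel_um = PIXEL_SIZE_UM * BINNING

print(f"wedge per frame: {wedge_per_frame(ROTATION_RATE_DEG_PER_S, EXPOSURE_S):.1f} deg")
for wedge in (40, 80):
    frames = frames_for_wedge(wedge, ROTATION_RATE_DEG_PER_S, EXPOSURE_S)
    dose = max_dose(wedge, ROTATION_RATE_DEG_PER_S, MAX_DOSE_RATE)
    print(f"{wedge} deg wedge: {frames} frames, dose below {dose:.1f} e-/Å^2")
print(f"binned image: {binned_width} px wide, {binned_pixel_um:.1f} µm per pixel")
```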

X-ray diffraction data from nanocrystals of NACore were collected using XFEL 
radiation at the CXI instrument (Coherent X-ray Imaging) at the Linac Coherent 
Light Source (LCLS)-SLAC. The photon energy of the X-ray pulses was 8.52 keV 
(1.45 Å). Each 40 fs pulse contained up to 6 × 10¹¹ photons at the sample 
position, taking into account a beam line transmission of 60%. The diameter of 
the beam was approximately 1 µm. We used a concentration of approximately 
25 µl of pelleted material suspended in 1 ml water. The sample was injected 
into the XFEL beam using a liquid jet injector and a gas dynamic virtual nozzle. 
The micro jet width was approximately 4 µm and the flow rate was 
40 µl min⁻¹. The sample caused noticeable sputtering of the liquid jet. XFEL 
data were processed using cctbx.xfel. 
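The correspondence between the 8.52 keV photon energy and the 1.45 Å wavelength follows from λ = hc/E. A minimal sketch, assuming hc = 12.398419843 keV Å (derived from CODATA values) and using illustrative function names:

```python
# Product of Planck's constant and the speed of light, in keV·Å (assumed constant).
HC_KEV_ANGSTROM = 12.398419843


def photon_wavelength_angstrom(energy_kev):
    """Photon wavelength (Å) for a given photon energy (keV)."""
    return HC_KEV_ANGSTROM / energy_kev


def photon_energy_kev(wavelength_a):
    """Photon energy (keV) for a given wavelength (Å)."""
    return HC_KEV_ANGSTROM / wavelength_a


xfel_wavelength = photon_wavelength_angstrom(8.52)
synchrotron_energy = photon_energy_kev(0.9791)
print(f"XFEL pulses at 8.52 keV: {xfel_wavelength:.3f} Å")
print(f"synchrotron beam at 0.9791 Å: {synchrotron_energy:.2f} keV")
```

The same conversion applied to the synchrotron wavelength quoted for the SubNACore data gives the corresponding photon energy.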

Calibration of the sample to detector distance in MicroED was accomplished 
using a polycrystalline gold standard and by referencing the prominent reflec- 
tions in the electron diffraction experiment with the corresponding reflections in 
the XFEL data. Calibration of the x/y locations of the 64-tile CSPAD detector was 
performed by cctbx.xfel by refining the optically measured tile positions against a 
thermolysin data set. 

To gain compatibility with conventional X-ray data processing programs, the 
MicroED diffraction images were converted from tiff or TVIPS format to the 
SMV crystallographic format. We used XDS to index the diffraction images, and 
XSCALE for merging and scaling together data sets originating from different 
crystals. For NACore, data from four crystals were merged, while for PreNAC, 
data from three crystals were merged to assemble the final data sets (see Extended 
Data Table 1). 

Structure determination. The molecular replacement solution for SubNACore 
was obtained using the program Phaser. The search model consisted of a 
geometrically ideal β-strand composed of nine alanine residues. Crystallographic 
refinements were performed with the program Refmac. 

The molecular replacement solution for NACore was obtained using the program 
Phaser. The search model consisted of the SubNACore structure determined 
previously. Crystallographic refinements were performed with the programs 
Phenix and Buster. 

The molecular replacement solution for PreNAC was obtained using the 
program Phaser. The search model consisted of a geometrically ideal β-strand 
composed of six residues with sequence GVTTVA. Crystallographic refinements 
were performed with the programs Phenix and Refmac. 

Model building for all segments was performed using COOT. Data processing 
and refinement statistics are reported in Extended Data Table 1. The 
coordinates of the final models and the structure factors have been deposited in the 
Protein Data Bank with PDB code 4RIK for SubNACore, 4RIL for NACore, and 
4ZNN for PreNAC. The structures were illustrated using Pymol. 
Protein expression and purification. The human wild-type α-synuclein 
construct has been previously characterized (pRK172, ampicillin, T7 promoter) 
with sequence: MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVL 
YVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAG 
SIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDY 
EPEA. 

Full-length α-synuclein was purified according to published protocols. The 
α-synuclein construct was transformed into Escherichia coli expression cell line 
BL21 (DE3) gold (Agilent Technologies) for wild-type α-synuclein protein 
expression. A single colony was inoculated into 100 ml LB Miller broth (Fisher 
Scientific) supplemented with 100 µg ml⁻¹ ampicillin (Fisher Scientific) and grown 
overnight at 37 °C. One litre of LB (Miller) supplemented with 100 µg ml⁻¹ 
ampicillin in 2-l shaker flasks was inoculated with 10 ml of overnight culture and 
grown at 37 °C until the culture reached OD₆₀₀ ≈ 0.6-0.8 as measured by a 
BioPhotometer UV/VIS Photometer (Eppendorf). IPTG (isopropyl 
β-D-1-thiogalactopyranoside) was added to a final concentration of 0.5 mM, 
and the culture was grown for 4-6 h at 30 °C. Cells 
were harvested by centrifugation at 5,500g for 10 min at 4 °C. The cell pellet was 
frozen and stored at −80 °C. 

The cell pellet was thawed on ice and resuspended in lysis buffer (100 mM 
Tris-HCl pH 8.0, 500 mM NaCl, 1 mM EDTA pH 8.0) and lysed by sonication. Crude 
cell lysate was clarified by centrifugation at 15,000g for 30 min at 4 °C. The 
clarified cell lysate was boiled and cell debris was removed by centrifugation. 
Protein in the supernatant was precipitated in acid at pH 3.5 by titrating HCl 
into the protein solution on ice while stirring, and the mixture was then 
centrifuged at 15,000g for a further 30 min at 4 °C. Supernatant was dialysed 
against buffer A (20 mM Tris-HCl, pH 8.0). After dialysis the solution was 
filtered through a 0.45 µm syringe filter (Corning) before loading onto a 20 ml 
HiPrep Q HP 16/10 column (GE Healthcare). The Q-HP column was washed with 
five column volumes of buffer A and protein was eluted using a linear gradient 
to 100% buffer B (20 mM Tris-HCl, 1 M NaCl, pH 8.0) over five column 
volumes. Protein eluted at around 50-70% buffer B; peak fractions were pooled. 
Pooled samples were concentrated approximately tenfold using Amicon 
Ultra-15 centrifugal filters. 
Approximately 5 ml of the concentrated sample was loaded onto a HiPrep 26/60 
Sephacryl S-75 HR column equilibrated with filtration buffer (25 mM sodium 
phosphate, 100 mM NaCl, pH 7.5). Peak fractions were pooled from the gel 
filtration column, dialysed against 5 mM Tris-HCl, pH 7.5, and concentrated to 
3 mg ml⁻¹. These were filtered through a 0.2 µm pore size filter (Corning) and 
stored at 4 °C. 

Recombinantly expressed full-length α-synuclein with an N-terminal 
acetylation was prepared and purified in the following way, based on a protocol 
detailed in ref. 16. The α-synuclein plasmid was co-expressed with a 
heterodimeric protein acetylation complex from Schizosaccharomyces pombe to 
acetylate the N terminus (pACYC-DUET, chloramphenicol, T7 promoter). The 
two vectors were co-transformed into E. coli BL21 (DE3) using media containing 
both ampicillin and chloramphenicol. Cell cultures were grown in TB media 
containing ampicillin and chloramphenicol and induced to express α-synuclein 
with 0.5 mM IPTG overnight at 25 °C. Cells were harvested by centrifugation, 
the cell pellet was then resuspended in lysis buffer (100 mM Tris-HCl pH 8.0, 
500 mM NaCl, 1 mM EDTA pH 8.0, and 1 mM phenylmethylsulfonyl fluoride) 
and cells were lysed using an Emulsiflex homogenizer (Avestin). The lysate was 
boiled and debris removed by centrifugation. A protein fraction was also 
removed by precipitation at low pH on ice followed by centrifugation. The 
remaining supernatant was pH adjusted by titration and dialysed against buffer 
A (20 mM Tris-HCl, pH 8.0, 1 mM DTT, 1 mM EDTA, pH 8.0). The resulting 
protein solution was loaded onto a 5 ml Q-Sepharose FF column (GE 
Healthcare) equilibrated with buffer A and eluted against a linear gradient of 
buffer B (1 M NaCl, 20 mM Tris-HCl, pH 8.0, 1 mM DTT, 1 mM EDTA, 
pH 8.0). Fractions containing α-synuclein were identified using SDS-PAGE, 
collected, concentrated and further purified by size exclusion (Sephacryl S-100 
16/60, GE Healthcare) in 20 mM Tris, pH 8.0, 100 mM NaCl, 1 mM DTT, 
1 mM EDTA. Purity of fractions was assessed by SDS-PAGE. 

Acetylated protein was characterized by LC-MS. Expected average mass: 
14460.1 Da for α-synuclein and 14502.1 Da for acetylated α-synuclein. Observed 
average mass: 14464.0 Da for α-synuclein and 14506.0 Da for acetylated 
α-synuclein (Extended Data Fig. 6). The shift of 4 Da between observed and 
expected average masses is due to instrumental error. 
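The expected average masses can be reproduced from the α-synuclein sequence given above. The sketch below assumes standard average residue masses and treats N-terminal acetylation as the addition of C₂H₂O (about 42.04 Da); the mass table and function names are illustrative and do not come from the LC-MS software used here.

```python
# Standard average residue masses (Da) for the amino acids present in α-synuclein.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "L": 113.1594, "I": 113.1594, "N": 114.1038, "D": 115.0886,
    "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "Y": 163.1760,
}
WATER_MASS = 18.0153          # one water per chain for the free termini
ACETYL_SHIFT = 42.0367        # C2H2O added by N-terminal acetylation

# Human wild-type α-synuclein sequence as listed in the Methods.
ALPHA_SYNUCLEIN = (
    "MDVFMKGLSKAKEGVVAAAEKTKQGVAEAAGKTKEGVL"
    "YVGSKTKEGVVHGVATVAEKTKEQVTNVGGAVVTGVTAVAQKTVEGAG"
    "SIAAATGFVKKDQLGKNEEGAPQEGILEDMPVDPDNEAYEMPSEEGYQDY"
    "EPEA"
)


def average_mass(sequence):
    """Average molecular mass (Da) of an unmodified peptide chain."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


unmodified = average_mass(ALPHA_SYNUCLEIN)
acetylated = unmodified + ACETYL_SHIFT
print(f"residues: {len(ALPHA_SYNUCLEIN)}")
print(f"unmodified: {unmodified:.1f} Da, acetylated: {acetylated:.1f} Da")
print(f"observed minus expected (unmodified): {14464.0 - unmodified:.1f} Da")
```

Both observed masses exceed the calculated values by about 4 Da, so the 42 Da difference attributable to acetylation is preserved.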
Fibril formation and detection. Purified α-synuclein in 50 mM Tris, 150 mM KCl, 
pH 7.5 was shaken at a concentration of 500 µM at 37 °C in a Torrey Pines shaker. To 
form the fibrillar samples of SubNACore and NACore, lyophilized peptides were 
dissolved to a final concentration of 500 µM in 5 mM lithium hydroxide, 20 mM 
sodium phosphate pH 7.5 and 0.1 M NaCl. All samples were shaken at 37 °C in a 
Torrey Pines shaker for 72 h. Freshly dissolved samples were prepared by dissolving 
lyophilized peptides immediately before addition to cells for assays. 

Turbidity measurements were used to compare NACore and SubNACore 
aggregation. Peptide samples were freshly dissolved to 1.6 mM in a sample buffer 
with 5 mM LiOH and 1% DMSO and then filtered through a PVDF filter 
(Millipore, 0.1 µm). Measurements were performed using a black NUNC 96-well 
plate with 200 µl of sample per well (3-4 replicates per sample). The plate was 
agitated at 37 °C, with a 3 mm rotation diameter, at 300 r.p.m. in a Varioskan 
microplate reader (Thermo). Absorbance readings were recorded every 3-15 min 
at 340 nm. 

Negative-stain transmission electron microscopy. Cytotoxicity samples were 
evaluated for presence of fibrils by electron microscopy. In brief, 5-µl samples 
were spotted directly on freshly glow-discharged carbon-coated electron 
microscopy grids (Ted Pella). After 4 min incubation, grids were rinsed twice with 
5 µl distilled water and stained with 2% uranyl acetate for 1 min. Specimens were 
examined on an FEI T12 electron microscope. 

Fibril diffraction. Fibrils formed from purified α-synuclein with and without 
N-terminal acetylation were concentrated by centrifugation, washed, and 
oriented while drying between two glass capillaries. Likewise, NACore 
nanocrystals were also concentrated, washed with nanopure water, and allowed 
to orient while drying between two glass capillaries. The glass capillaries holding 
the aligned fibrils or nanocrystals were mounted on a brass pin for diffraction at 
room temperature using 1.54 Å X-rays produced by a Rigaku FRE+ rotating 
anode generator equipped with an HTC imaging plate. All patterns were collected 
at a distance of 180 mm and analysed using the Adxv software package. A 
simulated pattern from the full-length α-synuclein model presented in Fig. 3 
was obtained by calculating structure factors from the model using the sfall 
module from CCP4, assigning the model a unit cell of 200 × 4.74 × 200 Å. 
Cylindrical averaging of these structure factors about the fibre axis (y axis) 
direction produced a set of simulated fibril diffraction intensities. 
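To relate the principal d-spacings to positions on the imaging plate, Bragg's law can be combined with the flat-detector geometry. The sketch below assumes an X-ray wavelength of 1.54 Å (Cu Kα), the stated 180 mm sample-to-detector distance and a detector perpendicular to the beam; the function name is illustrative.

```python
import math

WAVELENGTH_A = 1.54          # assumed Cu Kα wavelength (Å)
DETECTOR_DISTANCE_MM = 180.0  # sample-to-detector distance from the text


def ring_radius_mm(d_spacing_a, wavelength_a=WAVELENGTH_A,
                   distance_mm=DETECTOR_DISTANCE_MM):
    """Radius on a flat detector of the reflection with the given d-spacing."""
    theta = math.asin(wavelength_a / (2.0 * d_spacing_a))  # Bragg angle
    return distance_mm * math.tan(2.0 * theta)             # scattering angle is 2θ


# Principal reflections shared by NACore and the α-synuclein fibrils (Å).
for d in (8.2, 4.6, 2.4):
    print(f"d = {d:.1f} Å -> radius {ring_radius_mm(d):.1f} mm")
```

Smaller d-spacings scatter to larger angles, so the 2.4 Å reflection lies furthest from the beam centre.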

Cytotoxicity assays. Adherent PC12 cells (ATCC CRL-1721) were cultured in 
ATCC-formulated RPMI-1640 medium (ATCC 30-2001) supplemented with 
10% horse serum and 5% fetal bovine serum and plated at 10,000 cells per well 
to a final volume of 90 µl. All MTT assays were performed with Cell Titer 96 
aqueous non-radioactive cell proliferation kit (MTT, Promega cat. no. 4100). 
Cells were cultured in 96-well plates for 20 h at 37 °C in 5% CO₂ before addition 
of samples (Costar cat. no. 3596). 10 µl of sample was added to each well 
containing 90 µl of medium and incubated for 24 h at 37 °C in 5% CO₂. Then, 
15 µl dye solution (Promega cat. no. 4102) was added into each well, followed by 
incubation for 4 h at 37 °C in 5% CO₂. This was followed by the addition of 
100 µl Solubilization Solution/Stop Mix (Promega cat. no. 4101) to each well. 
After 12 h incubation at room temperature, the absorbance was measured at 
570 nm. Background absorbance was recorded at 700 nm. The data were 
normalized with cells treated with 1% (w/v) SDS set to 0% reduction, and cells 
treated with sample buffer set to 100% reduction. 
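The normalization described above amounts to a two-point rescaling of background-corrected absorbance. A minimal sketch follows, assuming the background is removed by subtracting the 700 nm reading from the 570 nm reading; the function names and the absorbance values are hypothetical and are not data from this study.

```python
def corrected_absorbance(a570, a700):
    """Background-corrected absorbance (assumed to be A570 minus A700)."""
    return a570 - a700


def percent_mtt_reduction(sample, sds_control, buffer_control):
    """Rescale so that SDS-treated cells give 0% and buffer-treated cells give 100%."""
    return 100.0 * (sample - sds_control) / (buffer_control - sds_control)


# Hypothetical plate readings (A570, A700), for illustration only.
sds = corrected_absorbance(0.30, 0.05)       # 1% (w/v) SDS control
buffer = corrected_absorbance(1.30, 0.05)    # sample-buffer control
sample = corrected_absorbance(0.80, 0.05)    # peptide-treated well

print(f"MTT reduction: {percent_mtt_reduction(sample, sds, buffer):.1f}%")
```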

Lactate dehydrogenase assays were done using CytoTox-ONE Homogeneous 
Membrane Integrity (Promega, cat. no. G7890) as per manufacturer’s instructions. 
In brief, cells were plated in 96-well, black-wall, clear-bottom (Fisher cat. no. 
07-200-588) tissue culture plates at 10,000 cells per well to a final volume of 
90 µl. Cells were incubated for an additional 20 h at 37 °C in 5% CO₂ before 
addition of samples. Next, 10 µl of sample was added to each well, following 
which the cells were incubated for another 24 h. 100 µl of reagent was added to 
each well and incubated for 15 min at room temperature. The addition of 50 µl of 
stop solution stopped the reaction. Fluorescence was measured in a Spectramax 
M5 (Molecular Devices) using excitation and emission wavelengths of 560 nm 
and 590 nm, respectively. Data were normalized using cells treated with buffer 
as 0% release and 0.1% Triton X-100 as 100% release. 

Construction of α-synuclein A53T fibril model. A model for full-length 
α-synuclein A53T mutant fibrils that are involved in the early onset of PD was 
constructed using a section of the NACore crystal packing as a scaffold. Figure 3 
illustrates the four copies of the NACore segment used for the scaffold. The crystal 
structure of the two inner strands was adapted with minimal changes as the 
analogous segments 68-78. The structure of PreNAC was matched onto the weak 
interface of the NACore structure. Only 4 of the 11 side chains in the segment 
46-56 differ from those in the NACore segment 68-78 and residues V51-V55 can be 
closely matched to V71-V75. Hence the models for both the homotypic interface 
and heterotypic interface in the full-length fibre model closely resemble those 
observed in the NACore structure. The regions outside these segments were 
adapted from the structure of the native α-synuclein fold, PDB ID: 2KKW. 
These segments were spliced in manually using COOT. The models were energy 
minimized and temperature annealed using the program CNS with 
hydrogen-bonding potential. The simulated fibre diffraction pattern calculated 
from this model shows prominent reflections that agree with those observed in 
fibre diffraction patterns of NACore, α-synuclein, and N-acetyl α-synuclein 
(Extended Data Table 2). 
Extended Data Figure 1 | A schematic representation of α-synuclein, 
highlighting the NAC region (residues 61-95) and within it the NACore 
sequence (residues 68-78). A series of bars span regions of α-synuclein that 
are of interest to this work. Among the three synuclein paralogues (α, β and γ), 
the region whose sequence is unique to α-synuclein is shown as a blue bar 
(residues 72-83) that overlaps with a large portion of NACore. Segments 
investigated in ref. 23 are also shown. These span a variety of regions within 
NACore. Two of the segments we investigate here, SubNACore and NACore, 
are shown in this context. Only one of the segments studied in ref. 23 is an exact 
match to our NACore sequence, and only this segment is both toxic and 
fibrillar. The sequences of α-synuclein, β-synuclein, and γ-synuclein are 
shown as a reference with conserved residues in bold and the NACore sequence 
in red. 
Extended Data Figure 2 | NACore difference density maps calculated after 
successful molecular replacement using the SubNACore search model 
clearly revealed the positions of the missing residues (positive Fo − Fc 
density at N and C termini corresponding to G68 and A78) and one water 
molecule near a threonine side chain (red circle); a second water was located 
during the refinement process. The blue mesh represents 2Fo − Fc density 
contoured at 1.2σ. The green and red mesh represent Fo − Fc densities 
contoured at 3.0σ and −3.0σ, respectively. All maps were σA-weighted. 
Extended Data Figure 3 | The crystal structure of NACore reveals pairs of 
sheets as in the spines of amyloid fibrils. a, NACore’s two types of 
sheet-sheet interfaces: a larger interface (orange, 268 Å² of buried accessible 
surface area per chain) we call interface A, and a weaker interface (blue, 
167 Å²) we call interface B. The crystal is viewed along the hydrogen-bonding 
direction (crystal ‘b’ dimension). The red lines outline the unit cell. b, The 
van der Waals packing between sheets. The sheets are related by a 2₁ screw axis 
denoted in black. The only gaps left by the interface are filled with water 
molecules which hydrogen-bond to the threonine residues (partially showing 
aqua spheres). The shape complementarity of both interfaces is 0.7. The viewing 
direction is the same as in a. c, Orthogonal view of the fibrillar assembly. The 
protofibril axis, coinciding with the 2₁ screw axis designated by the arrow, 
runs vertically between the pairs of sheets. 
Extended Data Figure 4 | Comparison of the crystal packing for NACore 
and SubNACore. a, The face-to-face interactions are virtually the same for the 
pairs of NACore segments (orange chains) in its crystal structure and the 
SubNACore segments (white chains) in its structure (interfaces A and B shown 
in gold and blue, respectively). The table below shows the pairwise r.m.s.d. 
values comparing the nine residues shared in common between the structures. 
RMSD_res is an all-atom comparison between residue pairs, while RMSD_ca 
compares only Cα pairs. b, PreNAC (blue) is compared with NACore (orange). 
Five residues from each strand are shown in darker colour and the r.m.s.d. 
values between their Cα pairs are compared in the table below. The 
PreNAC-NACore interaction mimics the weaker interface B in the NACore 
structure. 

a, NACore versus SubNACore 

Residue        1      2      3      4      5      6      7      8      9      All 
Amino acid     A      V      V      T      G      V      T      A      V      - 
RMSD_res (Å)   0.183  0.171  0.118  0.103  0.185  0.225  0.160  0.351  1.316  0.499 
RMSD_ca (Å)    0.174  0.050  0.083  0.071  0.183  0.105  0.156  0.201  0.555  0.226 

b, PreNAC versus NACore 

Residue        1      2      3      4      5      All 
Amino acid     G      V      T      T/A    V      - 
RMSD_ca (Å)    0.919  0.214  0.503  0.319  3.199  1.515 
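
As a consistency check on the two r.m.s.d. tables, the overall ('All') Cα value should equal the root mean square of the per-residue Cα deviations, because each residue contributes exactly one Cα atom. The sketch below is illustrative only and is not part of the authors' analysis; the function name `combine_rmsd` is hypothetical, and the per-residue values are copied from the tables.

```python
import math


def combine_rmsd(per_residue):
    """Root mean square of per-residue deviations (one atom per residue)."""
    return math.sqrt(sum(d * d for d in per_residue) / len(per_residue))


# Per-residue C-alpha deviations in angstroms, panel a (NACore vs SubNACore).
ca_panel_a = [0.174, 0.050, 0.083, 0.071, 0.183, 0.105, 0.156, 0.201, 0.555]
# Per-residue C-alpha deviations in angstroms, panel b (PreNAC vs NACore).
ca_panel_b = [0.919, 0.214, 0.503, 0.319, 3.199]

overall_a = combine_rmsd(ca_panel_a)  # tabulated 'All' value is 0.226
overall_b = combine_rmsd(ca_panel_b)  # tabulated 'All' value is 1.515
print(f"panel a: {overall_a:.3f} A, panel b: {overall_b:.3f} A")
```

Both results agree with the tabulated values to within the rounding of the inputs. The all-atom RMSD_res column cannot be recombined this way without knowing how many atoms each residue contributes.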
Extended Data Figure 5 | Intense reflections common among the NACore 
and the two polymorphs of full length α-synuclein suggest common 
structural features. Common structural features are illustrated here on the 
crystal packing diagrams of NACore. The (0,0,2) planes approximate the 
separation between sheets in interface A (orange). The (0,2,0), (−1,1,1), and 
(1,1,1) reflections are intense because the corresponding Bragg planes 
recapitulate the staggering of strands from opposing sheets. The red lines 
correspond to the unit cell boundaries and all planes are shown in black. 
The location of the unit cell origin is indicated by ‘O’. The unit cell dimensions 
a, b, and c are labelled. Bragg spacings (spacings between planes), indicated by 
‘d’, are given in angstroms. 
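
The Bragg spacings referred to in this caption can be estimated from the unit cell alone. The sketch below applies the standard monoclinic d-spacing formula (unique axis b) to the NACore cell listed in Extended Data Table 1 (a = 70.8 Å, b = 4.82 Å, c = 16.79 Å, β = 105.7°). It is an illustration under those assumptions, not the authors' procedure, and the function name `monoclinic_d_spacing` is hypothetical.

```python
import math


def monoclinic_d_spacing(h, k, l, a, b, c, beta_deg):
    """Interplanar spacing d(hkl) in angstroms for a monoclinic cell (unique axis b).

    1/d^2 = [h^2/a^2 + l^2/c^2 - 2*h*l*cos(beta)/(a*c)] / sin^2(beta) + k^2/b^2
    """
    beta = math.radians(beta_deg)
    in_plane = h * h / (a * a) + l * l / (c * c) - 2 * h * l * math.cos(beta) / (a * c)
    inv_d_squared = in_plane / math.sin(beta) ** 2 + k * k / (b * b)
    return 1.0 / math.sqrt(inv_d_squared)


# NACore unit cell (space group C2), values taken from Extended Data Table 1.
nacore_cell = {"a": 70.8, "b": 4.82, "c": 16.79, "beta_deg": 105.7}

d_002 = monoclinic_d_spacing(0, 0, 2, **nacore_cell)   # between-sheet direction
d_020 = monoclinic_d_spacing(0, 2, 0, **nacore_cell)   # half the strand repeat b
d_m111 = monoclinic_d_spacing(-1, 1, 1, **nacore_cell)

for label, d in [("(0,0,2)", d_002), ("(0,2,0)", d_020), ("(-1,1,1)", d_m111)]:
    print(f"{label}: d = {d:.2f} A")
```

With these inputs the (0,2,0) spacing is simply b/2, about 2.4 Å, and the (0,0,2) spacing comes out near 8 Å, which is the scale of the sheet-to-sheet separation described in the caption.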
Extended Data Figure 6 | Mass spectrometry analysis of recombinantly 
expressed, full-length α-synuclein, with and without N-terminal acetylation. 
The mass profile of wild-type full length α-synuclein (left) is compared to 
that of an N-terminally acetylated form of the protein (right). The mass of 
the N-terminally acetylated form is appropriately shifted with respect to the 
native form of the protein (14464.0 Da for α-synuclein and 14506.0 Da for 
acetylated α-synuclein), within a margin of error of 4 Da. 
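
The expected size of the acetylation shift is easy to work out: N-terminal acetylation replaces one hydrogen with a COCH3 group, a net gain of C2H2O. The short sketch below compares that expected value with the two masses quoted in the caption. The average atomic masses are standard values; the variable names are illustrative.

```python
# Average atomic masses in daltons.
AVG_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Acetylation swaps one H for COCH3: net addition of C2H2O.
expected_shift = 2 * AVG_MASS["C"] + 2 * AVG_MASS["H"] + AVG_MASS["O"]

# Masses quoted in the caption (Da).
native_mass = 14464.0
acetylated_mass = 14506.0
observed_shift = acetylated_mass - native_mass

discrepancy = abs(observed_shift - expected_shift)
print(f"expected {expected_shift:.3f} Da, observed {observed_shift:.1f} Da, "
      f"difference {discrepancy:.3f} Da")
```

The observed 42.0 Da difference sits well inside the stated 4 Da margin of error around the expected shift of about 42.04 Da.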
Extended Data Table 1 | Statistics of data collection and atomic refinement for NACore, its fragment SubNACore, and PreNAC 

                      SubNACore           NACore              PreNAC 
Segment               69AVVTGVTAV77       68GAVVTGVTAVA78     47GVVHGVTTVA56 

Data collection 
Radiation source      Synchrotron         Electron            Electron 
Space group           C2                  C2                  P21 
Cell dimensions 
  a, b, c (Å)         61.9, 4.80, 17.3    70.8, 4.82, 16.79   17.9, 4.7, 33.0 
  α, β, γ (°)         90, 104.1, 90       90, 105.7, 90       90, 94.3, 90 
Resolution (Å)        1.85 (1.95-1.85)    1.43 (1.60-1.43)    1.41 (1.56-1.41) 
Wavelength (Å)        0.9791              0.0251              0.0251 
Rmerge                0.117 (0.282)       0.173 (0.560)       0.236 (0.535) 
Rr.i.m.               0.135 (0.322)       0.199 (0.647)       0.264 (0.609) 
Rp.i.m.               0.065 (0.154)       0.093 (0.311)       0.185 (0.305) 
I/σI                  5.2 (2.7)           5.5 (2.5)           4.6 (1.8) 
CC1/2 (%)             99.5 (97.8)         99.4 (92.3)         96.7 (74.0) 
Completeness (%)      97.9 (98.3)         89.9 (82.6)         86.9 (69.6) 
Multiplicity          4.1 (4.0)           4.4 (4.3)           3.7 (3.5) 

Refinement 
Resolution (Å)        1.85 (2.07-1.85)    1.43 (1.60-1.43)    1.41 (1.41-1.57) 
No. reflections       470 (125)           1073 (245)          1006 (239) 
Rwork                 0.176 (0.248)       0.248 (0.253)       0.235 (0.336) 
Rfree                 0.221 (0.286)       0.275 (0.331)       0.282 (0.329) 
CCwork                0.964 (0.896)       0.947 (0.618)       0.937 (0.335) 
CCfree                0.889 (0.993)       0.986 (0.269)       0.967 (0.361) 
No. atoms 
  Protein             57                  66                  66 
  Water               3                   2                   4 
B-factors (Å²) 
  Protein             17.1                9.0                 16.1 
  Water               27.6                2.7                 24.6 
Wilson B (Å²)         11.8                10.3                13.8 
R.m.s. deviations 
  Bond lengths (Å)    0.005               0.010               0.020 
  Bond angles (°)     1.1                 1.6                 2.0 
PDB ID code           4RIK                4RIL                4ZNN 
EMDB ID code          -                   EMD-3028            EMD-3001 

*Highest resolution shell is shown in parentheses. Data quality is indicated 
by the redundancy-independent merging R factor (r.i.m.) and the 
precision-indicating merging R factor (p.i.m.). 
Extended Data Table 2 | Comparison of reflections observed in powder diffraction of fibrils of full-length α-synuclein, N-acetyl α-synuclein, 
and a synthetic pattern calculated from our α-synuclein model, to aligned nanocrystals of NACore. 

Segment               Reflections (Å) 

NACore                2.21, 2.26, 2.39, 2.52, 2.61, 2.68, 2.78, 3.02, 3.12, 3.34, 3.56, 3.86, 
(GAVVTGVTAVA)         4.34, 4.57, 5.16, 5.98, 7.56, 8.19, 10.46, 11.63, 13.29, 16.61 

α-syn                 2.39, 4.64, 6.82, 8.29, 10.06 

N-acetyl α-syn        2.38, 4.62, 8.18, 9.80, 11.90 

Simulated α-syn       2.23, 2.25, 2.35, 3.29, 3.63, 3.70, 3.95, 4.08, 4.56, 4.68, 8.36, 8.69, 
                      21.76, 24.47, 27.61, 31.67 

Bold reflections are strong and common to all three samples. Colours of the labelled reflections match those in Fig. 2. 
Structure of mammalian eIF3 in the 
context of the 43S preinitiation complex 


Amedee des Georges, Vidya Dhote, Lauriane Kuhn, Christopher U. T. Hellen, Tatyana V. Pestova, 
Joachim Frank & Yaser Hashem 


During eukaryotic translation initiation, 43S complexes, comprising a 40S ribosomal subunit, initiator transfer RNA and 
initiation factors (eIF) 2, 3, 1 and 1A, attach to the 5′-terminal region of messenger RNA and scan along it to the initiation 
codon. Scanning on structured mRNAs also requires the DExH-box protein DHX29. Mammalian eIF3 contains 13 
subunits and participates in nearly all steps of translation initiation. Eight subunits having PCI (proteasome, COP9 
signalosome, eIF3) or MPN (Mpr1, Pad1, amino-terminal) domains constitute the structural core of eIF3, to which 
five peripheral subunits are flexibly linked. Here we present a cryo-electron microscopy structure of eIF3 in the 
context of the DHX29-bound 43S complex, showing the PCI/MPN core at ~6 Å resolution. It reveals the organization 
of the individual subunits and their interactions with components of the 43S complex. We were able to build 
near-complete polyalanine-level models of the eIF3 PCI/MPN core and of two peripheral subunits. The implications 
for understanding mRNA ribosomal attachment and scanning are discussed. 


Translation initiation in eukaryotes begins with binding of eIF3, eIF1, 
eIF1A and the eIF2-GTP-Met-tRNAᵢᴹᵉᵗ ternary complex to the 40S 
subunit, forming a 43S preinitiation complex. The 43S complex 
attaches to the cap-proximal region of mRNA after unwinding of 
its secondary structure by eIF4A, eIF4B and eIF4F, and scans down- 
stream to the initiation codon, where it forms a 48S initiation complex 
by codon-anticodon base pairing. Scanning on structured mam- 
malian mRNAs also requires DHX29, which binds directly to the 
40S subunit. Finally, eIF5 and eIF5B promote joining of the 60S 
subunit to the 48S complex, yielding an elongation-competent 80S 
ribosome. 

eIF3 is the largest, most complex initiation factor, which interacts 
with several eIFs, including eIF1 and the eIF4G subunit of eIF4F (refs 
1-3). eIF3 is involved in almost all steps of initiation, including ribo- 
somal recruitment of the ternary complex, attachment of 43S complexes 
to mRNA via interaction with eIF4G, and scanning. The ~800 kilodalton 
(kDa) mammalian eIF3 comprises 13 subunits (a–m) (Extended 
Data Fig. 1a). Six subunits (a, c, e, k, l and m) contain PCI domains, 
which consist of N-terminal helical repeats followed by a winged helix 
domain (WHD) that mediates PCI polymerization, and two subunits (f 
and h) contain MPN domains, which consist of a β-barrel surrounded 
by α-helices and β-strands that function to promote assembly of 
multiprotein complexes. The PCI/MPN subunits form the octameric 
structural core of eIF3. Cryo-electron microscopy (cryo-EM) studies 
revealed the organization of the five-lobed PCI/MPN core of 
mammalian eIF3, and confirmed the similarity of its topology with those of 
the proteasome lid and the COP9 signalosome. However, the 
resolution of eIF3 in these studies (12–20 Å) was insufficient to reveal 
molecular details of the PCI/MPN core organization. 

Four of the remaining subunits (b, d, g and i) are stably linked to the 
PCI/MPN core, probably in a flexible manner. Domains in these 
subunits include RNA recognition motif (RRM) domains (eIF3b and 
eIF3g) and WD40 β-propeller domains (eIF3b and eIF3i). 
eIF3b, eIF3i and eIF3g form a separate module, which attaches to 
the PCI/MPN core through its interaction with the eIF3a 
carboxy-terminal domain (CTD). The last subunit, eIF3j, is 
substoichiometric and loosely attached to the rest of eIF3 (ref. 22). Whereas most 
eukaryotes encode a complete set of eIF3 subunits, Saccharomyces 
cerevisiae and related yeasts retain only six: two PCI (a and c) and 
four non-core (b, i, g and j) subunits, with eIF3j being non-essential. 

We recently determined the structure of mammalian eIF3 and its 
position on the 40S subunit by cryo-EM reconstruction of the 
DHX29-bound 43S complex at 11.6 Å resolution. The PCI/MPN 
core resides on the back of the 40S subunit, making two contact points 
via its left arm and head with ribosomal proteins (rp) eS1/eS26 and 
uS15/eS27, respectively. Two additional densities, on the solvent side 
underneath helix 16 (h16) and on the head behind RACK1, were 
attributed to the peripheral domains of eIF3 belonging to non-core 
subunits. However, the resolution of this complex was insufficient for 
modelling of eIF3. Thus, although recent crystallographic studies 
revealed some important structural aspects of yeast eIF3 (ref. 17), 
molecular details of mammalian eIF3 organization remained obscure. 

Here we present a high-resolution cryo-EM reconstruction of 
mammalian eIF3 (lacking eIF3j) in the context of the DHX29-bound 
43S complex. The reconstructed density map allowed us to derive a 
near-complete polyalanine-level model of the eIF3 PCI/MPN octa- 
mer core and two peripheral subunits, using homology and ab initio 
modelling. 


Sample preparation and electron microscopy 


To obtain the structure of 40S-associated eIF3, DHX29-bound 43S 
complexes were prepared as described (Supplementary Information). eIF3 
was purified from rabbit reticulocyte lysate and contained C-terminally 
truncated eIF3a (2-1061; see Supplementary Information ‘Sample 
composition and image processing’, Extended Data Fig. 2a and Extended 
Data Table 1a). 

The processed imaged particles yielded an ~6 Å cryo-EM reconstruction 
on average (Extended Data Figs 1b and 2d, f). Local refinement of the 
orientations of the images was applied to improve the resolution of eIF3 
(Extended Data Fig. 2c, e) leading to a resolution of ~6 Å for most 
regions of the eIF3 core. Compared to our previous study, secondary 
structure elements are now clearly resolved in most regions of the core 
(Extended Data Fig. 1c-f). 


Structure and atomic model of eIF3 PCI/MPN core 


The PCI/MPN octamer core adopts the classic five-lobed shape. 
Density segmentation (Fig. 1a–c) was performed on the basis of 
topological similarity with the proteasome lid and the COP9 
signalosome, and rigid-body fitting of crystal structures of yeast eIF3a 
and eIF3c (ref. 17). This analysis revealed the general organization of 
the eIF3 core (Fig. 1d). The PCI subunits (a, c, e, l, k and m) are 
arranged sequentially, forming an arc. The MPN subunits (f and h) 
bind to each other and attach to the rest of the octamer mainly 
through the association of eIF3f with eIF3m. In addition to the PCI 
arc, at least one α-helix of each subunit (except for eIF3a and eIF3m) is 
involved in the formation of a seven-helix bundle. 

Each subunit of the eIF3 core was modelled in its density segment 
using ab initio and homology modelling. When there was no homology 
to rely on, the secondary structure elements (principally α-helices) of 
the concerned parts were predicted and built into the map. The 
resulting model fits the eIF3 core density (Figs 1e–g and 2) with a 
cross-correlation coefficient of 0.94. Despite the good resolution of the eIF3 
core in the cryo-EM map, one region of density could not be assigned 
(Extended Data Fig. 1g). 

The model of the eIF3 core reveals that the PCI arc, the major 
interaction hub, consists of a large arched β-sheet assembled from 
the WHD β-sheets of the individual PCI domains of subunits a, c, e, l, 
k and m (Fig. 2c). This first interaction hub wraps around the second, 
which comprises a seven-helix bundle formed by packing of the 
C-terminal helices of subunits c, e, f, h, k and l (Fig. 2a, c, g). To assign 
helices to individual subunits, we used the conserved helical bundle 
structure in the proteasome lid and COP9 signalosome, as the 



Figure 1 | Structure of eIF3 core. a-c, Segmented eIF3 core, coloured variably 
by subunit, seen from different orientations. d, Two-dimensional 
representation of the three-dimensional structure of the eIF3 octamer core. The 
helical bundle is represented by coloured bars. Zig-zagged line on eIF3c 
indicates possibly unstructured N-terminal tail. The eIF3a C-terminal region 
(not present in the structure reported here) is not shown in this schematic 
representation. e-g, Fitting of the eIF3 core model in its cryo-EM segmented 
density. 
Figure 2 | Model of the eIF3 core. a, g, Close-up view of the seven-helix 
bundle formed by subunits h, c, e, l and k, seen in two orientations. 
b-f, Polyalanine-level model of the eIF3 octamer core seen from different 
orientations. c, Close-up view of β-sheet arc of the PCI domains of subunits m, 
a, c, e, k and l. The seven-helix bundle was cut out by fading it to highlight 
the arched β-sheet. h, i, Close-up views of additional quaternary contacts 
between eIF3a and eIF3c (h) and eIF3c and eIF3e (i). 


organization of the seven-helix bundle is conserved in eIF3 (Extended 
Data Fig. 3a-c). 

A comparison of our model of the PCI/MPN core of eIF3 with 
those of the proteasome lid and the COP9 signalosome (Extended 
Data Fig. 3d-i) showed the existence of additional quaternary inter- 
actions between eIF3a and eIF3c, and between eIF3c and eIF3e 
(Fig. 2h, i and Extended Data Fig. 3j, k). eIF3a interacts through an 
insert between helices 5 and 6 with eIF3c in a cavity formed by helix 6, 
the coil between helices 7 and 8 and the coil between helices 14 and 15. 
The eIF3c-eIF3e interaction involves helix 11 of eIF3c and the 
N-terminal tail of eIF3e. These interactions rigidify the assembly of 
eIF3a, eIF3c and eIF3e, and the rotation of their helical-repeat 
domains with respect to their WHDs (Extended Data Fig. 3l). 
Notably, the insert in mammalian eIF3a, which participates in the 
quaternary interaction, is lacking in yeast eIF3a. Mammalian eIF3a 
also possesses a more complex C-terminal helix following the PCI 
domain, which is involved in additional interactions with other core 
subunits (Figs 1d and 2). 

To validate our model, amino acid conservation at the interfaces of 
core subunits was assessed among five multicellular (Homo sapiens, 
Caenorhabditis elegans, Xenopus tropicalis, Arabidopsis thaliana and 
Drosophila melanogaster) and one unicellular (Neurospora crassa) 
organisms containing eIF3 with similar subunit composition 
(Supplementary Information ‘Validation of eIF3 core model’ and 
Extended Data Figs 4-6). Importantly, the a-c and c-e interfaces 
involving the WHDs are conserved in multicellular organisms ana- 
lysed, but not in N. crassa. However, the additional quaternary inter- 
actions between these subunits are conserved even in N. crassa, 
suggesting their importance in all eukaryotes that retain the octameric 
composition of the eIF3 core. The interfaces between the remaining 
subunits are conserved among all analysed organisms with few excep- 
tions in N. crassa. 

Our current model exhibits large discrepancies with a recent 
model based on low-resolution data concerning the structure and 
conformation of various core subunits (that is, differences in the 
conformations of eIF3f, eIF3h and eIF3e, as well as in the structures 
of the C-terminal α-helices of eIF3f and eIF3h, the assignment of 
which was inverted in that model, thereby altering the structure of 
the seven-helix bundle) (Extended Data Fig. 7). 


Ribosomal contacts of the PCI/MPN core 


To identify residues in the eIF3 core that directly contact components of 
the 43S complex, the atomic model of the human 40S subunit, the 
crystal structure of the archaeal eIF2 ternary complex, and the 
homology model of the C-terminal two-thirds of DHX29 (ref. 27) 
were fitted into the density (Methods). Direct interactions of eIF3a 
and eIF3c with the 40S subunit were observed. eIF3a contacts rpeS1 
and, marginally, rpeS26 (Fig. 3a, pink and yellow arrowheads), but 
also the apical loop of ES7S (Fig. 3, top grey arrowhead). eIF3c 
interacts with rpeS27 and, marginally, with rpuS15 (Fig. 3, green and cyan 
arrowheads), but also with the apical loop of ES7S (Fig. 3, bottom grey 
arrowhead) and h22 (Fig. 3, red arrowhead). A detailed list of 
interactions is presented in Extended Data Table 1b. They are consistent 
with those observed for S. cerevisiae eIF3a and eIF3c (ref. 17), with the 
exception of the interaction between the C terminus of yeast eIF3a and 
rpuS2 (refs 28-30), which is absent in the mammalian complex. 
Furthermore, the yeast eIF3a-eIF3c core is rotated by ~24° away 
from the platform and towards the solvent side in contrast to 
mammalian eIF3 (ref. 28). 


Peripheral subunits of eIF3 


To improve the resolution in peripheral domains of eIF3, focused 
classification was applied to the regions where peripheral subunits 
are located (Extended Data Fig. 9a, b), yielding reconstructions with 
local resolutions of ~9 Å and ~10 Å for regions of eIF3 near h16 and 
on the head next to RACK1, respectively. 

Segmentation of density corresponding to the peripheral domains of 
eIF3 near h16 revealed a nine-bladed β-propeller structure (Fig. 4a), 
unambiguously attributed to the WD40 domain of eIF3b (refs 16, 17). 
Density projecting from the WD40 domain towards the beak of the 40S 
subunit, in front of h16, corresponds to the eIF3b RRM domain. We 
have derived a homology model of residues 73-685 of rabbit eIF3b 
(~85%) (Fig. 4b). In the 43S complex, eIF3b binds through its 
C-terminal tail, after the WD40 domain (residues 626-631), to the 
40S subunit at the tip of helix A (hA) of ES6S (Fig. 4a, right, red 
arrowhead), and interacts with rpuS4 and DHX29 over a large surface 
(Fig. 4c, slate grey and green arrowheads) (Extended Data Table 1b). 
The position of eIF3b in mammalian DHX29-bound 43S complexes is 
similar to that in yeast 40S-eIF3-eIF1-eIF1A complexes, despite the 
absence of DHX29 in the latter. 

The low-resolution density extending from the eIF3b WD40 
domain away from the 40S subunit (Fig. 4a, dashed oval) was 
attributed to eIF3i, on the basis of crystal structures of the latter with the 
C-terminal helix of yeast eIF3b (ref. 15) and of the yeast 
eIF3b-CTD-eIF3i-eIF3g-NTD complex, and the size and shape compatibility 
with our segmented density. In the 43S complex, eIF3i is seen to 


Figure 3 | Contacts of the eIF3 core with the 40S subunit in the 43S complex. 
Solvent-side view of eIF3 bound to the DHX29-bound 43S complex. In black 
rectangle, close-up view of 40S-eIF3 core contacts. Coloured arrowheads 
indicate the interaction of the eIF3 core with ribosomal proteins and RNA. 
Figure 4 | Peripheral subunits of eIF3. a, Segmented cryo-EM reconstruction 
focused on eIF3 peripheral subunits localized near the mRNA channel entry, 
below DHX29, seen from the solvent face (left) and from below (middle). 
Close-up view of eIF3 peripheral subunits seen from below (right), with 
a red arrowhead indicating the connection with ES6S-hA and numbers 
indicating the blades of the β-propeller structure. b, Atomic model of rabbit 
eIF3b (orange), yeast eIF3i (purple) and a long α-helix (red) corresponding to a 
fragment of the C-terminal helical region (‘eIF3a C-ter’). c, Left, front 
view of eIF3b and eIF3i subunits bound to the 40S subunit and DHX29 in the 
context of the 43S complex. The remaining, unmodelled one-third of DHX29 is 
denoted as a transparent green surface, based on its cryo-EM density. Right, 
close-up view of the contact points between eIF3b and DHX29. The panel also 
shows the interaction of eIF3i with DHX29. d, Left, segmented cryo-EM 
reconstruction focused on an eIF3 peripheral subunit tentatively identified as 
eIF3d, localized on the 40S head behind ribosomal protein RACK1. Middle, 
the same reconstruction seen from above, rendered in transparent with the 
atomic model of the human 40S subunit fitted in. Right, close-up view of the 
putative eIF3d subunit. Red arrow indicates a density bridging the globular 
domain of eIF3d to a density corresponding to the eIF2α-D1 domain, part of 
the eIF2-GTP-Met-tRNAᵢᴹᵉᵗ ternary complex (TC) (density coloured in 
gold). Coloured arrowheads indicate the interaction of eIF3 peripheral subunits 
with various ribosomal proteins. 


interact only with DHX29 (Fig. 4c, purple arrow). No density could 
be attributed to the eIF3g component of the eIF3g-eIF3b-eIF3i 
module because of the low local resolution in that region, probably due to 
the flexibility of this assembly. 

The residues forming the long curved helix that interacts with 
eIF3b at its N and C termini (Fig. 4b, red helix) could not be 
determined with certainty because of its discontinuity with the eIF3 core. 
However, on the basis of its curvature and the angle it forms with the 
core, it seems to extend from the C-terminal helix of eIF3a. This 
attribution is consistent with observations that eIF3a, eIF3b, eIF3i 
and eIF3g form a stable complex, and with the existence of a 
spectrin domain at the eIF3a C-terminal end, which mediates 
interactions with eIF3b and eIF3i (ref. 19). 

The second peripheral subunit-binding site, behind RACK1, 
displays a globular mass with a local resolution of ~10 Å and no resolved 
secondary structure features (Fig. 4d and Extended Data Fig. 9a), 
which adopts a flat triangular shape at an increased density threshold. 
This mass was attributed to eIF3d, consistent with the latter’s 
cross-linking with positions −8 to −17 on mRNA and because the only 
other unassigned subunit, eIF3g, constitutes part of the 
eIF3b-eIF3i-eIF3g module. According to our structure, eIF3d directly contacts 
RACK1, rpeS28, rpuS7 and rpuS9 (Fig. 4d, right, and Extended 
Data Fig. 9b). Notably, it also appears to interact with the D1 domain 
of eIF2α, as seen at a lower density threshold (Fig. 4d, right, red 
arrow). 


Discussion 


Our reconstruction of mammalian eIF3 in the context of the 
DHX29-bound 43S complex enabled us to build a near-complete 
polyalanine-level model of the eIF3 octameric PCI/MPN core, and to determine 
the ribosomal positions of 11 of its 12 stoichiometrically associated 
subunits (Fig. 5). eIF3 forms three regions of density corresponding to 
the PCI/MPN core, the eIF3b-eIF3i-eIF3g module, and eIF3d. The 
PCI/MPN core binds at the solvent side of the 40S subunit opposite to 
the platform, with eIF3a and eIF3c establishing two contact points. 
The eIF3b-eIF3i-eIF3g module resides at the mRNA entrance (with 
eIF3b interacting directly with the 40S subunit), and is connected to 
the core by the eIF3a-CTD. eIF3d is located near the mRNA exit, 
behind RACK1. 

The model of the PCI/MPN core reveals its assembly and intersubunit 
interactions in mammalian eIF3 (Figs 1 and 2), which are probably 
conserved in other organisms possessing eIF3 with a similar octameric 
core. This model is consistent with reports concerning the interactions 
of individual subunits and the existence of their stable 
sub-complexes, and with the finding that the functional core of 
mammalian eIF3 comprises one non-PCI/MPN subunit (b) and five 
PCI/MPN subunits (a, c, e, f and h). Thus, whereas eIF3a, eIF3c and eIF3e 
are the central constituents of the PCI arc, the PCI subunits k, l and m 
reside at its extremities (Fig. 1b, d). Consistently, eIF3k and eIF3l are 
easily displaced from eIF3 (ref. 18) and have been lost from the genome 
of some species. eIF3m, which is not encoded in Trypanosoma brucei 
and Leishmania, can also be absent without major perturbation to the 
PCI arc. Importantly, the structure and arrangement of the ‘essential’ 
PCI subunits (a, c and e) are further rigidified by the quaternary a-c and 
c-e interactions, which are absent between their counterparts in the 
proteasome lid and COP9 signalosome. 

Figure 5 | Schematic representation of the arrangement of initiation factors 
and their subunits in the DHX29-bound 43S complex. a-c, Figure 
includes only eIFs and subunits for which the structures are known. The 40S 
subunit is depicted in grey surface; all other factors and subunits are labelled 
and coloured variably. eIF3 helical bundles fortifying the intersubunit 
interactions are represented as cylinders. The 43S complex is shown from the 
back (a), solvent side (b) and front (c). 

The large distance separating individual parts of eIF3 on the 40S 
subunit points to specialization of their functions. The main role of 
the PCI/MPN subunits seems to be scaffolding. It involves direct 
interaction with the 40S subunit, eIF1 (ref. 28) and eIF4G (refs 34, 
35), and configuration of the non-core subunits within the same 
complex, while allowing them to interact with distant ribosomal 
regions. Interaction with eIF4G is essential and is thought to bridge 
the cap-binding complex, coupling its helicase activity with 43S 
complexes during attachment to mRNA and scanning. Initially, eIF4G 
was shown to interact with eIF3e (ref. 34), but recent analysis revealed 
a second, adjacent site on eIF4G that binds to eIF3c and eIF3d (ref. 
35). This suggests that the binding site for eIF4G is located in a region 
of eIF3 where the c, d and e subunits come into proximity. Although 
in our density map eIF3d is separated from the PCI/MPN core, at least 
part of the unassigned density at the core (Extended Data Fig. 1g, red 
density) could belong to eIF3d. Importantly, regarding the position of 
eIF3e in 43S complexes, its interaction with eIF4G is also consistent 
with hydroxyl radical cleavage in ES6S from the eIF4G middle 
domain. Taken together, the structure of the eIF3 core and biochemical 
data on the eIF3-eIF4G interaction allow us to suggest cautiously 
that eIF4F could reside on the solvent side of the 43S complex, per- 
haps between the PCI/MPN core and the eIF3b-eIF3i-eIF3g module. 

Regarding the function of the eIF3b-eIF3i-eIF3g module, various 
studies implicated its subunits in scanning. Thus, eIF3b was suggested 
to be involved in start-codon selection, eIF3g in reinitiation by 
recycled 40S subunits, while eIF3i and eIF3g were suggested to 
stimulate scanning. The eIF3b-eIF3i-eIF3g module is situated at 
the mRNA channel entrance such that the mRNA would probably 
interact directly with the module before entering its binding channel. 
Thus, these factors could form an extension of the entrance portion 
of the mRNA channel, facilitating mRNA entry and maintaining its 
position during scanning. eIF3b could also influence the 
conformation of h16, as it binds at its base. It is thus possible that one role of the 
eIF3b-eIF3i-eIF3g module is to maintain the mRNA entrance in a 
specific conformation. eIF3b also has a large interacting surface with 
DHX29. DHX29 most likely stimulates scanning indirectly, by 
inducing conformational changes in 43S complexes. Interaction with 
eIF3b could therefore establish proper ribosomal positioning of 
DHX29 and possibly also transmit to the 40S subunit conformational 
changes that are induced in DHX29 upon NTP hydrolysis. 

As for eIF3d, in addition to interaction with eIF4G (ref. 35), its 
location near the mRNA exit and cross-linking to mRNA suggest 
that it might also participate in extending the exit portion of the 
mRNA channel. 

In conclusion, our model provides a detailed view of the structure 
of mammalian eIF3 and of its interactions with components of the 
DHX29-bound 43S complex. It will serve as a framework for further 
elucidation of individual steps in eukaryotic translation initiation, 
which is crucial to a better understanding of translational control. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS 


No statistical methods were used to predetermine sample size. 

Purification of 40S ribosomal subunits, initiation factors, DHX29 and 
aminoacylation of tRNAᵢᴹᵉᵗ. The exact protocols have been described previously. 
Assembly and purification of 43S-DHX29 preinitiation complexes. The exact 
protocols have been described previously. 

Electron microscopy. Three microlitres of each sample was applied to holey carbon
grids (carbon-coated Quantifoil 2/4 grid, Quantifoil Micro Tools GmbH) containing an
additional continuous thin layer of carbon, without previous plasma cleaning. Grids
were blotted and vitrified by rapidly plunging into liquid ethane at −180 °C with a
Vitrobot Mark IV (FEI). Data acquisition was done under low-dose conditions
(25 e⁻ Å⁻²) on an FEI Tecnai F30 Polara (FEI) operating at 300 kV. The data set was
collected with the automated data collection system Leginon at a calibrated
magnification of ×30,120 on a Gatan K2 Summit direct electron detection camera with
a pixel size of 1.66 Å on the object scale. A total of 11,274 micrographs were
recorded in electron-counting, dose-fractionation mode as a series of 20 frames, with
an accumulation time of 0.4 s per frame. The dose rate was set to 8 counts per
physical pixel per s (~10 e⁻ pixel⁻¹ s⁻¹). The total exposure time was 8 s, leading
to a total accumulated dose of approximately 25 e⁻ Å⁻² on the specimen.
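As a consistency check on the exposure figures above, the following Python sketch recomputes the total exposure time and the accumulated dose from the quoted frame count, frame time, dose rate and pixel size. The function names are illustrative and do not belong to any acquisition software. The calculation assumes that one detector pixel images an area of 1.66 Å × 1.66 Å on the specimen. Because the rate is quoted both as 8 counts and as roughly 10 e⁻ per pixel per second, both are evaluated; the two results bracket the stated value of approximately 25 e⁻ Å⁻².

```python
# Back-of-envelope check of the exposure figures quoted in the Methods.
# All inputs come from the text; function and variable names are illustrative.

def total_exposure_s(n_frames: int, frame_time_s: float) -> float:
    """Total exposure time of a dose-fractionated movie."""
    return n_frames * frame_time_s


def dose_per_square_angstrom(rate_per_pixel_s: float, exposure_s: float,
                             pixel_size_angstrom: float) -> float:
    """Accumulated dose on the specimen, in electrons per square angstrom.

    Assumes a uniform per-pixel rate and that one detector pixel images an
    area of pixel_size_angstrom ** 2 on the specimen.
    """
    return rate_per_pixel_s * exposure_s / pixel_size_angstrom ** 2


exposure = total_exposure_s(n_frames=20, frame_time_s=0.4)
low = dose_per_square_angstrom(8.0, exposure, 1.66)    # from detected counts
high = dose_per_square_angstrom(10.0, exposure, 1.66)  # from ~10 e- per pixel per s
print(f"exposure = {exposure:.1f} s")
print(f"dose between {low:.1f} and {high:.1f} e-/A^2")
```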
Image processing. Dose-fractionated image stacks were aligned using
dosefgpu_driftcorr and the sum of frames 3–20 in each image stack was used for
further processing. The first two frames were omitted as they contained motions
greater than 4 Å per frame on average. The data were then preprocessed using
pySPIDER and Arachnid (http://www.arachnid.us). Arachnid is a Python-encapsulated
version of SPIDER replacing SPIDER batch files with Python scripts. It also contains
novel procedures such as Autopicker, which was used for fully automated particle
selection. Approximately 652,000 particles were extracted from the data set. These
particles were classified with RELION, yielding two high-quality classes containing
272,252 particles, or ~42% of the data set. The remaining particles were rejected
(Extended Data Fig. 2). One class (19% of the particles) displays a 40S ribosomal
subunit with eIF3 and the eIF2–Met-tRNAiMet ternary complex bound, while the other
(23% of the particles) displays a bare 40S ribosomal subunit. The class containing
eIF3 was further classified, and all classes showing eIF3 with a density of good
quality were pooled together (84,850 particles) and refined to a resolution of
6.2 Å (Extended Data Fig. 1b). Reported resolutions were calculated with the
FSC = 0.143 criterion using the ‘gold standard’ protocol, ensuring independence of
half-set reconstructions. The reported resolution measurements were corrected for
the effects of a soft mask on the FSC curve using high-resolution noise
substitution. Before visualization, all density maps were sharpened by applying a
negative B-factor that was estimated using automated procedures. The local
resolution, as measured using the program ResMap, varies between 5 and 12 Å across
most of the map (Extended Data Fig. 2d, f). Particles were then realigned to the
eIF3 core as a reference using a soft mask around eIF3 core subunits. This procedure
improved the resolution of the more distal regions of eIF3, and resulted in a map
of eIF3 core subunits at local resolutions estimated by ResMap to be between 6 Å
(for most of the eIF3 core) and 8 Å (Extended Data Fig. 2c, e). These two maps were
used for modelling the eIF3 core subunits. Next, focused classification was
performed to isolate eIF3 peripheral subunits in two locations: in the vicinity of
DHX29 and at the back of the ribosome head, close to RACK1. In the focused
classification, we used masks with smooth edges created by UCSF Chimera and SEGGER,
and encompassing large regions around the regions of interest (Extended Data
Fig. 8). To avoid diverging orientation assignments during the RELION
classification, owing to the small volume encompassed by the mask compared to the
total volume, particle orientation assignments were kept fixed. The classification
with a mask in the region of DHX29 gave one major class (51% of the particles) with
well-defined eIF3 peripheral subunits in the vicinity of DHX29 (Extended Data
Fig. 8a), which was subsequently refined to a resolution of 7.1 Å (Extended Data
Fig. 1b). The classification with a mask around the back of the 40S head gave one
class (18% of the particles) with a strong density in this region, in contact with
RACK1 (Extended Data Fig. 8b). Further classification of particles in this class
reduced the conformational variability of the density and isolated one major
conformer (39% of the remaining particles). This class of particles, a total of
10,062 particles, yielded a reconstruction at 7.7 Å (Extended Data Fig. 1b).
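The FSC = 0.143 criterion mentioned above can be illustrated with a minimal Python sketch that reads off the resolution at which an FSC curve first falls below the threshold, interpolating linearly between neighbouring shells. This is a generic illustration and not the RELION implementation: the function name and the sample curve are invented for the example, and the correction for the soft mask by noise substitution is not modelled.

```python
from typing import Optional, Sequence

FSC_THRESHOLD = 0.143  # criterion quoted in the Methods


def resolution_at_threshold(spatial_freq: Sequence[float],
                            fsc: Sequence[float],
                            threshold: float = FSC_THRESHOLD) -> Optional[float]:
    """Return the resolution (in angstroms) where the FSC first drops below threshold.

    spatial_freq: shell frequencies in 1/angstrom, strictly increasing and positive.
    fsc: Fourier shell correlation between two independent half-maps, per shell.
    The crossing is located by linear interpolation between the two shells
    that straddle the threshold. Returns None if the curve never crosses.
    """
    if len(spatial_freq) != len(fsc):
        raise ValueError("spatial_freq and fsc must have the same length")
    for i in range(1, len(fsc)):
        above, below = fsc[i - 1], fsc[i]
        if above >= threshold > below:
            # Fraction of the way from shell i-1 to shell i where FSC == threshold.
            t = (above - threshold) / (above - below)
            freq = spatial_freq[i - 1] + t * (spatial_freq[i] - spatial_freq[i - 1])
            return 1.0 / freq
    return None


# Invented FSC curve, for illustration only (not data from this study).
freqs = [0.05, 0.10, 0.15, 0.20, 0.25]
curve = [0.99, 0.90, 0.50, 0.10, 0.02]
print(resolution_at_threshold(freqs, curve))
```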
Identification and annotation of eIF3 domains. The different domains of the
eIF3 constituent subunits were annotated (Extended Data Fig. 1a) on the basis
of our eIF3 polyalanine-level model where possible. Thus, the coordinates of
the WD40, RRM and PCI domains were annotated based on our model.
Alternatively, eIF3 domains were detected mainly using the conserved domain
detector (CDD) tool, which produces a graphical display of conserved domains
identified in the protein using RPS-BLAST, and the InterPro protein sequence
analysis and classification online tool. The annotation relies on the top-scoring
hits by default and represents a good approximation of the domain coordinates.
The sequences of rabbit eIF3 subunits used for domain detection had the following
NCBI accession numbers: gi|655868267 for eIF3a, gi|291390868 for eIF3c,
gi|655763812 for eIF3d, gi|291388413 for eIF3e, gi|655602713 for eIF3f,
gi|655743367 for eIF3g, gi|291388434 for eIF3h, gi|75070231 for eIF3i,
gi|655862979 for eIF3j, gi|291390078 for eIF3k, gi|291414663 for eIF3l and
gi|291384781 for eIF3m. Note that the human eIF3b sequence (gi|83367072)
was used rather than the rabbit sequence, because our nanoscale liquid
chromatography coupled to tandem mass spectrometry (nano-LC-MS/MS) analysis
(Extended Data Fig. 9c) raised doubts about the length of the N-terminal region of
the latter shown in currently available database entries. In addition to domains
identified using CDD (Extended Data Fig. 1a), a spectrin domain was added to
eIF3a (ref. 19) and a zinc-binding domain was added to eIF3g (ref. 56).

In-gel protein digestion for nanoLC-MS/MS analysis. The protocol has been 
described previously. In brief, SDS-PAGE bands were excised from the gel and 
transferred into 96-well microtitration plates. The bands were then destained and 
dehydrated. Gel pieces were washed again with the destaining solutions and then 
incubated with trypsin (Promega) and chymotrypsin (Promega) for digestion 
overnight at room temperature. The resulting peptides were extracted from the 
gel pieces. The initial digestion and extraction supernatants were pooled together 
and vacuum-dried in a SpeedVac concentrator. 
Nano-LC-electrospray-ionization TripleTOF MS/MS analysis. Dried tryptic digests
were resuspended in 12 µl of water containing 0.1% formic acid (solvent A) before
analysis on a NanoLC-2DPlus system (with nanoFlex ChIP module; Eksigent, Sciex
Separations) coupled to a TripleTOF 5600 mass spectrometer (AB Sciex) operating in
positive mode. Five microlitres of each sample was loaded on a ChIP C-18 precolumn
(300 µm ID × 5 mm ChromXP; Eksigent) at 2 µl min⁻¹ in solvent A. After 10 min of
desalting and concentration in the trap, the system was switched online with the
ChIP C-18 analytical column (75 µm ID × 15 cm ChromXP; Eksigent) equilibrated in 95%
solvent A and 5% solvent B (0.1% formic acid in acetonitrile). Peptides were eluted
by using a 5–40% gradient of solvent B for 60 min at a flow rate of 300 nl min⁻¹.
The TripleTOF 5600 was operated in data-dependent acquisition mode with Analyst
software (version 1.5, AB Sciex). Survey MS scans were acquired during 250 ms in
the 400–1,250 m/z range. Up to 20 of the most intense multiply charged ions (2+ to
5+) were selected for CID (collision-induced dissociation) fragmentation, if they
exceeded the 150 counts per second intensity threshold. Ions were fragmented using
a rolling collision energy within a 60 ms accumulation time and an exclusion time
of 15 s. This ‘top20’ method, with a constant cycle time of 1.5 s, was set in
high-sensitivity mode.
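The precursor selection rules of the ‘top20’ method can be sketched in Python as follows. This is a generic illustration of data-dependent acquisition logic under the parameters quoted above (up to 20 ions, charge 2+ to 5+, intensity above 150 counts per second, 15 s exclusion). It is not the Analyst implementation: the `Ion` type, the function name and the matching of ions by m/z rounded to 0.01 are assumptions made for the example, and the rolling collision energy is not modelled.

```python
from typing import Dict, List, NamedTuple


class Ion(NamedTuple):
    mz: float
    charge: int
    intensity_cps: float


def select_precursors(survey: List[Ion], now_s: float,
                      last_fragmented: Dict[float, float],
                      top_n: int = 20, min_cps: float = 150.0,
                      exclusion_s: float = 15.0) -> List[Ion]:
    """Pick up to top_n precursors from one survey scan.

    An ion is eligible if its charge is 2+ to 5+, its intensity exceeds
    min_cps, and it was not fragmented within the last exclusion_s seconds.
    last_fragmented maps a rounded m/z to the time it was last selected and
    is updated in place.
    """
    eligible = []
    for ion in survey:
        key = round(ion.mz, 2)  # assumption: ions are matched to 0.01 m/z
        last = last_fragmented.get(key)
        excluded = last is not None and now_s - last < exclusion_s
        if 2 <= ion.charge <= 5 and ion.intensity_cps > min_cps and not excluded:
            eligible.append(ion)
    # Most intense ions first, then keep at most top_n of them.
    eligible.sort(key=lambda ion: ion.intensity_cps, reverse=True)
    chosen = eligible[:top_n]
    for ion in chosen:
        last_fragmented[round(ion.mz, 2)] = now_s
    return chosen


# Invented survey scan, for illustration only.
scan = [Ion(500.25, 2, 1000.0), Ion(600.30, 1, 5000.0),
        Ion(700.40, 3, 100.0), Ion(800.50, 4, 400.0)]
history: Dict[float, float] = {}
first = select_precursors(scan, now_s=0.0, last_fragmented=history)
second = select_precursors(scan, now_s=1.5, last_fragmented=history)
later = select_precursors(scan, now_s=15.0, last_fragmented=history)
print([ion.mz for ion in first], second, [ion.mz for ion in later])
```

In this toy scan the singly charged ion and the ion below threshold are never selected, and the two eligible ions are skipped in the second cycle because they fall inside the exclusion window.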

To obtain optimal mass accuracy, a β-galactosidase digest (AB Sciex) was 
injected before each sample using the ‘Autocal’ feature from Analyst software: 
calibration was performed using the 10 most abundant peptides in MS mode and 
with the 729.3652 m/z precursor in MS/MS mode. Moreover, to prevent carry- 
over due to stationary phase memory, two consecutive washing runs were per- 
formed after each sample injection, as well as a blank injection (solvent A) to 
verify that no peptides were identified due to a carry-over phenomenon. 
Database search and data analysis. The protocol has been described previously. 
Data were searched against the complete human and rabbit proteomes set from the 
SwissProt database. The algorithm used for database search was Mascot (version 
2.2, Matrix Science) through the ProteinScape package (v3.1, Bruker). Peptide 
modifications allowed during the search were: N-acetyl (protein), carbamido- 
methylation (C) and oxidation (M). Mass tolerances in MS and MS/MS were set 
to 30 p.p.m. and 0.5 Da, respectively, and the instrument setting was specified as 
ESI-QUAD-TOF. Three trypsin or chymotrypsin missed cleavages were allowed. A 
decoy database strategy was used to validate Mascot identifications at FDR <1% 
(individual identity scores varied between 30 and 34 for each data search) using the 
ProteinScape Protein Assessment tool. Data were further validated by manually 
inspecting the quality of the MS/MS fragmentation spectra: a minimum of five 
consecutive amino acids was requested to validate the spectrum, as well as other 
rules (proline-specific fragmentation pattern, major peaks assigned to fragments, 
and so on). To compare the position of the validated peptides on the full-length 
sequences of the different proteins, the Protein Sequence viewer from ProteinScape 
package was used. Moreover, to obtain maximal sequence coverage on the iden- 
tified proteins, a second round of database search was set up without defining 
enzyme specificity: semi-tryptic or non-tryptic peptides were thus highlighted in 
the trypsin digest analysis, as for the SDS-PAGE bands digested by chymotrypsin. 
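To make the two search tolerances concrete, the Python sketch below converts the 30 p.p.m. precursor tolerance into an absolute window and applies the 0.5 Da fragment tolerance. The function names are illustrative and are not part of Mascot or ProteinScape; treating both tolerances as symmetric windows on m/z is an assumption of the example. The value 729.3652 m/z is the calibration precursor mentioned above and is used here only as a sample input.

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Absolute width (in m/z units) of a relative tolerance given in p.p.m."""
    return mz * ppm * 1e-6


def precursor_matches(observed_mz: float, theoretical_mz: float,
                      tol_ppm: float = 30.0) -> bool:
    """True if the observed precursor lies within tol_ppm of the theoretical value."""
    return abs(observed_mz - theoretical_mz) <= ppm_to_da(theoretical_mz, tol_ppm)


def fragment_matches(observed_mz: float, theoretical_mz: float,
                     tol_da: float = 0.5) -> bool:
    """True if the observed fragment lies within an absolute tolerance of tol_da."""
    return abs(observed_mz - theoretical_mz) <= tol_da


window = ppm_to_da(729.3652, 30.0)
print(f"30 p.p.m. at 729.3652 m/z = +/- {window:.4f}")
print(precursor_matches(729.3800, 729.3652))  # inside the 30 p.p.m. window
print(precursor_matches(729.4000, 729.3652))  # outside the 30 p.p.m. window
```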
Segmentation and display of density maps. Cryo-EM reconstructions were 
segmented using the SEGGER module implemented in UCSF Chimera. 
Segments containing fewer than 10,000 voxels were discarded. Segments were refined 
manually using the VOLUME ERASER module implemented in UCSF Chimera. 
Finally, the segments obtained were smoothed using a Gaussian filter in the 
VOLUME FILTER module also implemented in Chimera. The final maps were 
displayed and rendered with Chimera. 
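The size filter applied to the segments can be expressed as a short Python sketch. It is a stand-alone illustration and does not call SEGGER or Chimera: the function names are hypothetical, and the flat list of per-voxel integer labels with 0 as background is an assumed representation.

```python
from collections import Counter
from typing import Dict, Iterable

MIN_VOXELS = 10_000  # size cut-off quoted in the Methods


def segment_sizes(labels: Iterable[int]) -> Dict[int, int]:
    """Count voxels per segment label; label 0 is treated as background (assumption)."""
    counts = Counter(labels)
    counts.pop(0, None)
    return dict(counts)


def keep_large_segments(sizes: Dict[int, int],
                        min_voxels: int = MIN_VOXELS) -> Dict[int, int]:
    """Discard segments with fewer than min_voxels voxels."""
    return {label: n for label, n in sizes.items() if n >= min_voxels}


# Invented label volume, flattened to one dimension for simplicity.
labels = [0] * 5 + [1] * 12_000 + [2] * 9_999 + [3] * 10_000
sizes = segment_sizes(labels)
print(keep_large_segments(sizes))
```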
Atomic modelling of eIF3. We were able to model residues 7–605 of eIF3a
(~43%), 320–876 of eIF3c (~61%), 4–422 of eIF3e (~94%), 93–364 of eIF3f
(~75%), 29–352 of eIF3h (~92%), 2–216 of eIF3k (~98%), 181–552 of eIF3l
(~66%) and 7–370 of eIF3m (~97%). To generate an atomic model of eIF3, each
subunit was separately modelled into its corresponding density segment. We should
stress that our cryo-EM map does not allow atomic modelling of side-chains and
exact residue registration, and therefore our model of eIF3 should be considered as a
polyalanine-level model. eIF3a was modelled by homology to crystal structures of
S. cerevisiae eIF3a (refs 17, 61). The additional long helix in the C-terminal tail
was modelled ab initio by first predicting its secondary structure using the SYMPRED
web service tool (http://www.ibi.vu.nl/programs/sympredwww/), and then generating
the predicted long helix in SWISS-MODEL. The 3D structure generated was
then fitted into its cryo-EM density in UCSF Chimera by manually modifying the
backbone torsion angles using the ADJUST TORSION module implemented in
Chimera. The structure of the whole subunit was then refitted using molecular
dynamics flexible fitting as described in the next section.

Similarly, eIF3c was modelled by homology to the crystal structure of its
S. cerevisiae counterpart. The variant parts of its structure along with the helix in
the C-terminal tail were modelled as described above for eIF3a. Rabbit eIF3k was
modelled entirely by homology to the crystal structure of human eIF3k. The X-ray
structures of the human COP9 subunits CSN1, CSN5 and CSN6 (ref. 13) were used
as templates for modelling rabbit eIF3e, eIF3h and eIF3f; the variant parts of the
latter subunits, compared to their modelling templates, were modelled according to
the method described above for eIF3a, regarding the modelling of the C-terminal
helical tails. Modelling of the remaining eIF3m and eIF3l subunits was based mainly
on the atomic model of the proteasome lid subunits RPN9 and RPN3 (ref. 11) owing
to their closely related topology, as indicated by our cryo-EM reconstruction, and
variant parts were modelled ab initio according to the method described for eIF3a.
For most parts that were modelled ab initio, the secondary structure elements
predicted and validated by the cryo-EM reconstruction map were helical.

As for eIF3 peripheral subunits, based on our density map, the crystal structures
of the eIF3b WD40 domain from Chaetomium thermophilum, the C-terminal
helical domain of S. cerevisiae eIF3b in complex with eIF3i (seven-bladed
β-propeller) and the solution NMR structure of the RRM domain of the N-terminal
end of human eIF3b (ref. 14), were used to derive a homology model of nearly the
full eIF3b from rabbit (residues 73–685, ~85%, using the rabbit eIF3b sequence,
gi|291415469). This fragment corresponds to residues 170–782 of human eIF3b:
for further discussion of the eIF3b sequence see Extended Data Fig. 9c. The yeast
eIF3i crystal structure was rigid-body fitted using the FIT IN MAP module
implemented in Chimera. We did not attempt to generate a homology model of rabbit
eIF3i owing to the low local resolution of the cryo-EM reconstruction in that region
and simply fitted the crystal structure of yeast eIF3i as a rigid body.
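The correspondence between rabbit and human eIF3b numbering quoted above (rabbit 73–685, human 170–782) can be written as a small Python helper. The sketch assumes a constant offset across the whole fragment, which is consistent with the two quoted endpoints but remains an assumption: the actual alignment (Extended Data Fig. 9c) may contain local insertions or deletions. The constant and function names are illustrative.

```python
# Endpoints quoted in the text: rabbit 73-685 corresponds to human 170-782.
RABBIT_RANGE = (73, 685)
HUMAN_RANGE = (170, 782)

# Assumption: gap-free correspondence, so a single offset maps the fragment.
OFFSET = HUMAN_RANGE[0] - RABBIT_RANGE[0]


def rabbit_to_human(residue: int) -> int:
    """Map a rabbit eIF3b residue number to human numbering within the modelled fragment."""
    if not RABBIT_RANGE[0] <= residue <= RABBIT_RANGE[1]:
        raise ValueError(f"residue {residue} is outside the modelled fragment")
    return residue + OFFSET


print(OFFSET, rabbit_to_human(73), rabbit_to_human(685))
```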
Fitting of atomic structures into electron microscopy maps. After the construction
of the eIF3 polyalanine-level model, it was placed into its cryo-EM density
map along with the atomic model of the 40S subunit of the human ribosome by
rigid-body fitting using Chimera as described above. Remaining parts of the 43S
cryo-EM density map corresponding to the eIF2-ternary complex, eIF3b, eIF3i
and DHX29 were simply segmented out using the VOLUME ERASER module
implemented in UCSF Chimera. Starting from this system, everything was flexibly
fitted into the map using MDFF (molecular dynamics flexible fitting) as
described in previous work. In brief, the initial system was prepared for MDFF
using VMD and the trajectories run in NAMD. To achieve a better representation
of the inter- and intra-molecular interactions, the system was embedded
in a solvent box of TIP3P water molecules, with an extra 12 Å padding in each
direction, and neutralized by potassium ions, and an excess of ~0.2 M KCl was
added. The simulated system was prepared using CHARMM force field parameters
(Combined CHARMM All-Hydrogen Topology File for CHARMM22
Proteins and CHARMM27 Lipids). The trajectories were run in explicit solvent.
The run was stopped at 600 ps of simulation time.
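The solvent box and salt set-up described above involve two simple calculations, sketched here in Python: padding the bounding box of the solute by 12 Å on every side, and estimating how many K⁺/Cl⁻ pairs correspond to ~0.2 M in that volume. This is a stand-alone illustration and not the VMD procedure used for the actual system. The function names and the toy coordinates are invented, and the estimate uses the full box volume without subtracting the volume occupied by the solute.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]

AVOGADRO = 6.02214076e23  # particles per mole
A3_PER_LITRE = 1e27       # cubic angstroms in one litre


def padded_box(coords: Iterable[Point], padding: float = 12.0) -> Tuple[Point, Point]:
    """Axis-aligned bounding box of coords, extended by padding on every side."""
    xs, ys, zs = zip(*coords)
    lo = (min(xs) - padding, min(ys) - padding, min(zs) - padding)
    hi = (max(xs) + padding, max(ys) + padding, max(zs) + padding)
    return lo, hi


def box_volume(lo: Point, hi: Point) -> float:
    """Volume of the box in cubic angstroms."""
    return (hi[0] - lo[0]) * (hi[1] - lo[1]) * (hi[2] - lo[2])


def ion_pairs(volume_a3: float, molar: float = 0.2) -> int:
    """Number of salt ion pairs giving the requested molarity in the box volume."""
    return round(molar * AVOGADRO * volume_a3 / A3_PER_LITRE)


def neutralizing_cations(net_charge: int) -> int:
    """Monovalent cations needed to neutralize a negatively charged solute."""
    return max(0, -net_charge)


# Toy solute spanning 100 x 80 x 60 angstroms, for illustration only.
lo, hi = padded_box([(0.0, 0.0, 0.0), (100.0, 80.0, 60.0)])
volume = box_volume(lo, hi)
print(lo, hi, volume, ion_pairs(volume))
```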
Extended Data Figure 1 | Domain organization of the eIF3 subunits and 
resolution of eIF3 core. a, Schematic representation of the domain 
organization of the rabbit eIF3 subunits (see Methods). Domain boundaries are 
indicated, and based where possible on our polyalanine-level model of eIF3. 
HD, helical domain; Z, zinc-recognition motif. Dashed line in eIF3l subunit 
diagram indicates that the helical domain might extend further in the N 
terminus, but it was not possible to be conclusive based on our density map. 
b, Fourier shell correlation (FSC) of the different 43S complex reconstructions 
used during analysis. The resolution estimation followed the ‘gold standard’ 
protocol ensuring independence of the half-set reconstructions. x axis, 
resolution in Å; y axis, FSC. Green line denotes 43S complexes including 
particles that present the eIF3 core subunits; blue line denotes 43S complexes 
including all particles presenting DHX29 and the eIF3 peripheral subunits b, g 
and i; and red line denotes 43S complexes including all particles presenting the 
density attributed to eIF3d. For each reconstruction, a dashed line marks the 
resolution at which the FSC reaches the value of 0.143. c–f, Qualitative 
comparison of eIF3 core resolution in the present and in previous structures. 
c, d, eIF3 core cryo-EM structure from our previous study at 11.6 Å (ref. 7). 
e, f, eIF3 core cryo-EM structure from the present study at 6 Å after focused 
refinement. The eIF3 core is labelled according to the anthropomorphic 
nomenclature. g, Unassigned density region of the eIF3 core cryo-EM 
structure, coloured in red, seen from three different orientations. The green 
surface represents most of the core region that was modelled. 
Extended Data Figure 2 | Sorting of particle images and focused refinement 
of the five-lobed core of eIF3. a, Composition of eIF3 purified from RRL for 
assembly of 43S complexes, resolved by SDS-PAGE and analysed by nano-LC-MS/MS 
to characterize truncation of eIF3a and eIF3c due to endoproteolytic 
cleavage. The intensity of labels in this panel reflects the intensity of bands 
corresponding to the truncated forms of eIF3a and eIF3c. The sequence of the 
N-terminal region of rabbit eIF3b has not been conclusively established (see 
Extended Data Fig. 9c) and numbering therefore refers to human eIF3b. 

b, Overview of the process of sorting particle images. The population of each 
class is indicated by the number of particles and the percentage of the total 
number of particles at the beginning of each of the two classification rounds. 
The DHX29-bound 43S complex was processed from a total of ~650,000 
particle images, which were first sorted into ten different classes. Class 2 
(~125,000 particles) was sorted into ten subclasses, which are displayed in four 
different orientations, showing the intersubunit face, front, solvent side and 
bottom, respectively. Classes 2-7 and 9 from the second classification round 
were pooled and refined yielding a reconstruction presenting a variable 
resolution ranging from 4.5 to 15 Å (bottom right). d, f, Cryo-EM 
reconstruction of the 43S complex, coloured according to the local resolution as 
measured using ResMap (see Methods). The red and black boxes correspond 
to close-up views of the eIF3 core viewed from the intersubunit face (d) and 
the solvent face (f) of the 40S subunit. c, e, Cryo-EM reconstruction of the eIF3 
core after focused refinement, coloured according to the local resolution as 
measured using ResMap. Maps in c–f are filtered to 6 Å. 
Extended Data Figure 3 | Comparison of the structure of the mammalian
eIF3 core with the structures of the COP9 signalosome and 26S proteasome
lid molecules. a–c, Close-up views of the helical bundles of eIF3, COP9 and 26S
lid molecules. d–i, All three molecules shown in two different orientations.
Different constitutive subunits are labelled and coloured variably. Homologous
subunits among all three molecules are shaded using the same colour in all
panels. j, k, Additional quaternary interactions between eIF3a and eIF3c, and
between eIF3c and eIF3e, which occur neither in COP9 nor in 26S lid
molecules. l, Consequence of these additional quaternary interactions on the
structure of eIF3 subunits a, c and e, schematized as rectangles. Black arrows
around axes describe the rotation of the helical domains of subunits a, c and e
relatively to their respective PCI domains, owing to the existence of these
additional quaternary interactions, compared to COP9 and 26S lid molecules.
Extended Data Figure 4 | Conservation of quaternary interactions between
eIF3a and eIF3c subunits, and between eIF3c and eIF3e subunits. a, Ribbon
representation of eIF3a and eIF3c subunits. b–d, Close-up views of contact
regions between eIF3a and eIF3c. e, Ribbon representation of eIF3 c and e
subunits. f, g, Close-up views of contact regions between subunits c and e of
eIF3. Red spheres represent residues at the interfaces that are conserved in
eIF3 from six representative eukaryotic organisms: H. sapiens, C. elegans,
A. thaliana, D. melanogaster and X. tropicalis, which are very different
multicellular eukaryotic organisms, and N. crassa, a unicellular organism.
These organisms all have a full complement of 13 eIF3 subunits. Orange 
spheres represent residues at the interfaces that are conserved only in the five 
multicellular eukaryotic organisms. The remaining residues that are suggested 
by the model and the density map to be involved in quaternary interactions 
are represented as ribbons in salmon colour. Many of these other residues are 
conserved in three or four of the compared organisms, but almost all of them 
have conserved chemical properties. 
Extended Data Figure 5 | Conservation of quaternary interactions between
eIF3 subunits a, m, f and h. a, In centre, eIF3 a, m, f and h subunits. b–f,
Close-up views of contact regions between subunits a, m, f and h of eIF3. Red
spheres represent residues at the interfaces that are conserved in six
representative eukaryotic organisms: H. sapiens, C. elegans, A. thaliana,
D. melanogaster, X. tropicalis and the unicellular N. crassa. Orange spheres
represent residues at the interfaces that are conserved only in the five
multicellular eukaryotic organisms. The remaining residues that are suggested 
by the model and the density map to be involved in quaternary interactions are 
represented as ribbons in salmon colour. Many of these other residues are 
conserved in three or four of the compared organisms, but almost all of them 
have conserved chemical properties. 
Extended Data Figure 6 | Conservation of quaternary interactions between
eIF3 subunits e, k and l, and between eIF3 subunits in the region of the
helical bundle. a, Ribbon representation of eIF3 e, k and l subunits. b–f,
Close-up views of contact regions between subunits e, k and l of eIF3. g, Ribbon
representation of eIF3 octamer core. h, Close-up views of contact regions
between subunits a, c, e, m, f, h, k and l of eIF3 in the helical bundle region, seen
from the direction of the axis of the latter. i, Region displayed in b rotated by
90°. Red spheres represent residues at the interfaces that are conserved in six
representative eukaryotic organisms: H. sapiens, C. elegans, A. thaliana,
D. melanogaster, X. tropicalis and the unicellular N. crassa. Orange spheres
represent residues at the interfaces that are conserved only in the five
multicellular eukaryotic organisms. The remaining residues that are suggested 
by the model and the density map to be involved in quaternary interactions are 
represented as ribbons in salmon colour. Many of these other residues are 
conserved in three or four of the compared organisms, but almost all of them 
have conserved chemical properties. j, Same view as in i displaying all the 
hydrophobic residues of the helical bundle region in silver ribbons. The 
abundance of hydrophobic residues in the helical bundle at the interfaces of 
different helices suggests the stabilization of the bundle through hydrophobic 
interactions, hence the low identity conservation of residues at the interfaces. 
Extended Data Figure 7 | Comparison of the mammalian eIF3 core model
built from the 6 Å cryo-EM reconstruction with a model based on low-resolution
cryo-EM studies. In the centre, our polyalanine-level model of the
mammalian eIF3 octamer core (represented in red ribbons) fitted on the
atomic model proposed previously (represented in dark grey ribbons), shown
in two different orientations. The surrounding panels are close-up views of
different constitutive subunits, highlighting notable structural differences
between the previously proposed model and the model proposed in this study.
Gold- and cyan-dashed ovals highlight the misassignment of several helices of
the helical bundle belonging to the C termini of subunits h and f, respectively, 
in the previously proposed model. Red-, green-dashed circles and black 
arrowhead highlight the absence in the previously proposed model of 
important structural features involved in quaternary interactions. In each 
panel, the remaining subunits of the eIF3 core octamer are faded out as 
transparent ribbons. 
Extended Data Figure 8 | Sorting of particle images and focused
classification of eIF3 peripheral subunits near the mRNA channel entrance
and exit. a, Overview of the process of sorting particle images. The
population of each class is indicated by the number of particles and the
percentage of the total number of particles at the beginning of each of the
classification rounds. After the first round of classification, class 2 stands out as
the class displaying eIF3. Focused classification of eIF3 peripheral subunits,
near the mRNA channel entrance, was performed by applying a smooth-edge
mask corresponding to the shape of the concerned subunits of eIF3 and to a
region of the 43S complex encompassing DHX29 and h16 of the 40S subunit.
The mask is displayed as pink mesh. The resulting classes from the focused
sorting of class 2 of the first classification round are displayed in two different
orientations, front and solvent side, respectively. Class 6 from the second
classification round presents the most solid and complete density of the
peripheral subunits of eIF3 at this region of the complex, and it was therefore
refined yielding a reconstruction presenting an average resolution of 7.1 Å.
Cryo-EM reconstruction of eIF3b and eIF3i along with DHX29, coloured by
local resolution. b, After the first round of classification, on class 2, focused
classification of an eIF3 peripheral subunit, identified as eIF3d, near the mRNA
channel exit behind ribosomal protein RACK1, was performed by applying a
smooth-edge mask corresponding to the shape of the concerned subunit of eIF3 and
to a region of the head of the 40S subunit encompassing RACK1. The mask is
displayed as pink mesh. The resulting classes from the focused sorting of class 2
of the first classification round are displayed in two different orientations,
intersubunit face and top, respectively. Class 2 from the second round of
classification presents the most solid and complete density of the peripheral
subunit of eIF3 at this region of the complex, but due to some apparent
heterogeneity in eIF3d, a third classification round was required, yielding four
classes displaying a solid eIF3d subunit in slightly different conformations
(other classes obtained in this third round of classification were completely
empty and therefore not shown). The major class (39% of the particles in this
round) yielded a reconstruction presenting an average resolution of 7.7 Å.
Cryo-EM reconstruction of eIF3d, coloured by local resolution.
Extended Data Figure 9 | Shape and ribosomal binding site of eIF3d, and 
eIF3b sequence. a, Segmented cryo-EM reconstruction of the peripheral eIF3d 
subunit localized on the head of the 40S subunit, behind ribosomal protein 
RACK1, displayed at a high density threshold to show its most solid features, in 
four different orientations. b, eIF3d subunit in the context of the 43S 
preinitiation complex, seen from the back, showing the ribosomal proteins that 
contact it directly. This figure is complementary to Fig. 4c as it displays the same 
complex in a different orientation. This panel shows contacts between eIF3d 
subunit and ribosomal proteins eS28, uS7 and uS9. Contacts with RACK1 
cannot be seen from this orientation. c, eIF3b sequence. The amino acid
sequences of human eIF3b (GenBank NP_003742.2) and rabbit eIF3b
(UniProt G1SZ03_RABIT) aligned using T_COFFEE
(http://www.ebi.ac.uk/Tools/msa/tcoffee/) and annotated to show identity with
tryptic and chymotryptic peptides derived from purified rabbit eIF3 and identified by
nano-LC-MS/MS analysis. The complete sequence of rabbit eIF3b has not been 
determined, but clearly extends beyond the N terminus of G1SZ03_RABIT, 
and we therefore used the numbering of residues in human eIF3b when 
referring in the text to elements of rabbit eIF3b. 
Extended Data Table 1 | Peptide analysis of eIF3 a and c subunits after digestion and details of interactions between eIF3 and the 40S small ribosomal subunit


[Table body not recoverable from the extracted text. Panel a lists, for each gel band of eIF3a and eIF3c, the first peptide, last peptide, #Spc/#Peps and SC% after digestion with trypsin and with chymotrypsin (rabbit). Panel b lists the interacting residues of eIF3 and of the 40S small ribosomal subunit.]
a, Peptide analysis of eIF3 a and c subunits. For each band of the gel, the first and last identified peptides from eIF3 a and c subunits are indicated, after both types of digestion, with trypsin and with chymotrypsin. The
nano-LC-MS/MS analysis reveals the different forms of eIF3 a and c subunits in our in vitro reconstituted 43S preinitiation complex. #Spc = number of spectra, #Peps = number of peptides.
b, Details of interactions between eIF3 and the 40S small ribosomal subunit. The name and sequential number of interacting residues on each side, eIF3 and the 40S, are shown. Residue names are coloured variably
to distinguish them according to their origin.
The formation of submillimetre-bright galaxies
from gas infall over a billion years 


Desika Narayanan, Matthew Turk, Robert Feldmann, Thomas Robitaille, Philip Hopkins, Robert Thompson,
Christopher Hayward, David Ball, Claude-André Faucher-Giguère & Dušan Kereš


Submillimetre-bright galaxies at high redshift are the most luminous,
heavily star-forming galaxies in the Universe and are characterized
by prodigious emission in the far-infrared, with a flux of at least five
millijanskys at a wavelength of 850 micrometres. They reside in haloes
with masses about 10¹³ times that of the Sun, have low gas fractions
compared to main-sequence disks at a comparable redshift, trace
complex environments and are not easily observable at optical
wavelengths. Their physical origin remains unclear. Simulations
have been able to form galaxies with the requisite luminosities, but
have otherwise been unable to simultaneously match the stellar
masses, star formation rates, gas fractions and environments.
Here we report a cosmological hydrodynamic galaxy formation simu- 
lation that is able to form a submillimetre galaxy that simultaneously 
satisfies the broad range of observed physical constraints. We find 
that groups of galaxies residing in massive dark matter haloes have 
increasing rates of star formation that peak at collective rates of about 
500-1,000 solar masses per year at redshifts of two to three, by which 
time the interstellar medium is sufficiently enriched with metals that 
the region may be observed as a submillimetre-selected system. The 
intense star formation rates are fuelled in part by the infall of a 
reservoir gas supply enabled by stellar feedback at earlier times, not 
through major mergers. With a lifetime of nearly a billion years, our 
simulations show that the submillimetre-bright phase of high-red- 
shift galaxies is prolonged and associated with significant mass 
buildup in early-Universe proto-clusters, and that many submillimetre-bright
galaxies are composed of numerous unresolved components
(for which there is some observational evidence).

We conducted our cosmological hydrodynamic galaxy formation
simulations using the new hydrodynamic code GIZMO, which
includes a model for the impact of stellar radiative and thermal pressure
on the multiphase interstellar medium (ISM). This feedback both regulates
the star formation rate (SFR) and shapes the structure in the ISM.
Informed by clustering measurements of observed submillimetre galaxies
(SMGs), we focus on a massive halo (with a dark matter mass of
M_DM ≈ 10¹³ M⊙ at z = 2, where M⊙ is the solar mass and z is the
redshift) with baryonic particle mass M_bary ≈ 10⁵ M⊙ as the host of
our ‘main galaxy’, and run the simulation to z = 2. The only condition
of the tracked galaxy that is pre-selected to match the physical properties 
of observed SMGs is the chosen halo mass. We combine this with a new 
dust radiation transport package, POWDERDAY, that simulates the 
traverse of stellar photons through the dusty ISM of the galaxy, allowing 
us to robustly translate our hydrodynamic simulation into observable 
measures. We simulate the radiative transfer from a 200 kpc region 
around the main galaxy. This simulation represents the first cosmolog- 
ical model of a galaxy this massive to be explicitly coupled with dust 
radiative transfer calculations. The two codes and the simulation set-up 
are fully described in Methods. 


We define two distinct regions in the simulations. The ‘submillimetre
emission region’ is the 200 kpc region surrounding the central
galaxy in the halo of interest. This is the region where all of the modelled
850 μm emission comes from, and is what relates most directly to
observations. The ‘submillimetre galaxy’ refers to the central galaxy in 
the halo. Physical quantities from the submillimetre galaxy are most 
applicable to high-resolution observations, as well as to placing these 
models in the context of other theoretical galaxy formation models. As 
we will show, the submillimetre emission from the region is generally 
dominated by the central submillimetre galaxy, though the contri- 
bution from lower mass galaxies is often non-negligible. 

We track the submillimetre properties of the galaxies within the 
region from z ~ 6. The SFRs of galaxies in the region rise from this 
redshift towards later times z ~ 2, owing to accretion of gas from the 
intergalactic medium (Fig. 1). As stars form, stellar feedback-driven 
galactic winds generate outflows and fountains, allowing recycled gas 
to be available for star formation at later times (Extended Data Fig. 1). 
This phenomenon shapes a star formation history that is still rising at
z = 2, in contrast to galaxy formation models with more traditional
implementations of subresolution feedback, which peak at z ~ 3-6 for
galaxies of this mass. Mergers and global instabilities drive short-term
variability in the global SFR, while outflows and infall driven by
the feedback model can affect features in the star formation history in a
somewhat cyclical ‘saw-tooth’ pattern.

At its earliest stages (z ~ 4-6), the integrated SFR from the galaxies
in the region varies in the approximate range 100-300 M⊙ yr⁻¹, with
a significant stellar mass, (0.5-1) × 10¹¹ M⊙, in place, comparable to
some high-redshift detections. Feedback from massive stars enriches
the ISM with metals, and the dust content simultaneously rises. By
z = 3, the combination of gas accumulation and substantial metal
enrichment drives an increase in the dust mass by a factor of ~50,
with masses approaching ~1 × 10⁹ M⊙. Radiation from the delayed
peak in the SFR interacting with this substantive dust reservoir drives
the observed 850 μm flux density to detectable values of >5 mJy.
The galaxies associated with the main halo enter a long-lived submillimetre-luminous
phase, with a lifetime of ~0.75 Gyr. While our main
model is only run to z = 2 owing to computational restrictions for
models of this resolution, tests with lower-resolution models reveal
that at later times (z < 1.5), a declining SFR due to inefficient
accretion as well as exhausted gas supply drives a drop in the submillimetre
flux density (for more details, see Methods). The star formation
history of galaxies residing in haloes with M_DM ≈ 10¹³ M⊙ (at z = 2),
as controlled by the underlying stellar feedback, provides a physical
explanation for the peak in the observed SMG redshift distribution at
z = 2-3 (ref. 17).

During the submillimetre-luminous phase, the emitting region is 
almost always occupied by multiple detectable galaxies. In Fig. 2, we 
Figure 1 | Evolution of physical and observable properties of the
submillimetre emission region and the central galaxy. In each panel, the
properties of the 200 kpc submillimetre emission region are shown with thick
solid lines, while those of the central galaxy are given by thin dashed lines.
a, Stellar and dust mass; b, SFR; c, predicted observed 850 μm flux density;
d, specific SFR, sSFR (SFR/M★). The SFR is averaged on 50 Myr timescales, and


present gas surface density projections of six arbitrarily chosen snap- 
shots during the evolution of the submillimetre-luminous phase 
(z = 2-3). The panels are 250 kpc on a side; for reference, the full-width
at half-maximum (FWHM) of the Submillimetre Common-User
Bolometer Array (SCUBA) on the James Clerk Maxwell Telescope, the 
first instrument to detect SMGs, is ~125 kpc at z ~ 2. Multiple clumps 
of gas falling into the central galaxy are nearly always present. The 
observed flux density from the region is typically dominated by the 
central galaxy, with (on average) ~30% arising from emission from 
subhaloes (Extended Data Fig. 2). The submillimetre flux density of the 
central galaxy rises dramatically between z ~ 2-3, reaching a peak 
value of ~20 mJy. Owing to contributions from subhaloes surround- 
ing the central galaxy, the flux from the overall 200 kpc region can 
exceed this, peaking at ~30 mJy. Similarly extreme systems have 
recently been detected with the Herschel Space Observatory and the 
South Pole Telescope.

While the central galaxy is being bombarded by subhaloes over a 
range of mass ratios during the submillimetre-luminous phase, major 
galaxy mergers akin to local prototypical analogues such as Arp 220 or 
NGC 6240 do not drive the onset of the long-lived submillimetre- 
luminous phase in the central galaxy. In Fig. 1, we highlight when 
the galaxy undergoes a major merger with mass ratio ≥1:3. While
major mergers are common at early times (and indeed drive some 
short-lived bursts in star formation), the bulk of the submillimetre- 
luminous phase at later times (z ~ 2-3) occurs nearly a gigayear after 
the last major merger. The ratio of the SFR to its integral over cosmic 
time (the specific SFR) of the overall emitting region is generally on the 
main sequence of galaxy formation at z ~ 2 (defined as the main locus 
of points on the SFR-stellar mass (M★) relation), although the central
galaxy can have values comparable both to main-sequence galaxies 


includes a correction factor of 0.7 for mass loss. Locations of major galaxy 
mergers (>1:3) are noted by green vertical ticks on the top axis of b. The light 
purple shaded region in c shows when the galaxy would be detectable as an 
SMG with SCUBA (S₈₅₀ > 5 mJy). The pink and purple shaded regions in

d show the rough ranges for the z = 2 main-sequence (MS) and starburst 
regime; the grey region in d denotes below the main sequence. 


between z ~ 2-3 and to outliers. One consequence of a model in which 
SMGs typically lie on the main sequence of star formation is that 
the gas surface densities show a broad range, ~(10²-10⁴) M⊙ pc⁻²
(Extended Data Fig. 3), as well as diverse gas spatial extents (Fig. 3). 
This is manifested observationally in the broad swath occupied 
by SMGs in the Kennicutt-Schmidt star formation relation. The
spatial extent and surface density of the gas are to be contrasted,
however, with local merger-driven ultraluminous infrared galaxies,
which exhibit typical FWHM radii of ~100-500 pc (ref. 20).
Idealized galaxy merger simulations with initial conditions designed 
to form SMGs further underscore this contrast, as they also result in 
compact morphologies during final coalescence, and can be inefficient 
producers of submillimetre radiation owing to increased dust
temperatures.

The central submillimetre galaxy is amongst the most massive and 
highly star-forming of galaxies at this epoch. The stellar masses are 
diverse, in the range ~(1-5) × 10¹¹ M⊙, comparable to recent measurements
of this population, and consistent with constraints from
abundance matching techniques. The molecular gas fractions of the
central galaxy (f_gas = M_H₂/(M_H₂ + M★)) decline with stellar mass, and
range from ~40% at lower stellar masses to 10% at the highest
masses. This is in agreement with observations, although it is dependent
on the conversion from carbon monoxide (¹²CO) luminosity to H₂
gas mass. We note that these predictions are quantitatively different
from those produced by previous cosmological efforts in this field, with
some predicted gas fractions exceeding f_gas = 0.75 (refs 7, 9) and median
stellar masses as low as ~10¹⁰ M⊙ (ref. 7). We present plots of the
gas fractions and calculated spectral energy distributions (SEDs) of our 
model SMG in the context of observations in Extended Data Figs 4 
and 5. The modelled gas distributions within the central galaxy, which 
Figure 2 | Surface density projection maps of the 250 kpc region around
the central submillimetre galaxy for redshifts z~ 2-3. The submillimetre 
emission region probed in surveys typically encompasses a central galaxy in a 
massive halo that is undergoing a protracted bombardment phase by numerous 


range from ~1 kpc to 8 kpc, compare well with recent dust maps
observed using the Atacama Large Millimetre Array.

The stellar masses, gas fractions and lifetimes are in agreement 
with some previous lower-resolution cosmological efforts, although
the predicted SFR and luminosity from this model are substantially
larger. The SFR of the group of galaxies in the region peaks at
~1,500 M⊙ yr⁻¹. Importantly, up to half of the total infrared luminosity
Figure 3 | Gas and stellar radius distribution for the central submillimetre
galaxy. The orange histogram denotes the half-mass radius of the stars, 

while the blue shows the gas. The galaxy gas is more distributed in the central 
galaxy than the (subkiloparsec) extent expected from major mergers, although 
still sufficiently compact that it will remain unresolved even with approximately 
arcsecond resolution. The ordinate is weighted by the time the galaxy spends 
in the bin, and the overall normalization of the distribution is arbitrary. 
subhaloes. Some of the brightest SMGs arise from numerous galaxies within the
beam in a rich environment (bottom right panel). The colour coding denotes
the gas column density (N_H), with the colour bar on the right.


can come from older stars with ages t_age > 0.1 Gyr. Using standard
conversions, the estimated SFR from the integrated infrared SED
(3-1,100 μm) can exceed ~3,000 M⊙ yr⁻¹ (Extended Data Fig. 6), and
hence infrared-based SFR derivations of dusty galaxies at high z may
over-estimate the true SFR by a factor of ~2. Indeed, the contribution
of satellite galaxies to the global SFR, along with the contribution of old
stars to the infrared luminosity, may relieve some tensions between the
inferred SFRs from submillimetre galaxies and massive galaxies modelled
in cosmological hydrodynamic simulations.

The end-product of the central submillimetre galaxy at z ~ 2 is a
galaxy with a stellar mass of ~(4-5) × 10¹¹ M⊙ that is distributed over a
compact region of ~1-5 kpc, and gas that is distributed similarly
(Fig. 3). This is similar in extent and mass to the observed z ~ 2 compact
quiescent galaxy population, which has a mean half-light radius of
R_e ≈ 1.5 kpc, a stellar mass of M★ > 10¹¹ M⊙ and ages t_age ~ 0.5-1 Gyr
(ref. 26), suggesting a plausible connection between the galaxy populations.
Indeed, a calculation of the stellar velocity dispersion along three
orthogonal sightlines of the central galaxy during the submillimetre-luminous
phase results in σ★ ≈ 600-700 km s⁻¹, comparable to measurements
of high-z compact quiescents. A large sample of simulated
SMGs would allow for a robust analysis of the expected abundances of 
SMGs and compact quiescents at z ~ 2. 

Our picture for the formation of SMGs suggests that they are not 
transient events, but rather natural long-lived phases in the evolution of 
massive haloes. The ~0.75 Gyr duty cycle combined with the comoving
abundance of dark matter haloes of this mass results in an expected
abundance of our model SMGs of ~1.5 × 10⁻⁵ h³ Mpc⁻³, comparable
to the ~10⁻⁵ h³ Mpc⁻³ observed for SMGs. While modelling the full
number counts involves convolving the typical duty cycle as a function 
of halo mass with halo mass functions over a range of redshifts, the 
approximate abundances implied by this model are encouraging. 

This model suggests that galaxies that form in haloes of mass 
M_DM ≈ 10¹⁴ M⊙ at z = 0 will represent typical SMGs near the peak
of their redshift distribution. Lower mass galaxy models do not achieve 
the requisite SFR and metal enrichment to generate submillimetre-
luminous galaxies (see Methods). More extreme SMGs that are being 
detected for z = 5-6 (refs 29, 30) may form in even more massive (and 
rare) haloes than those considered here. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Cosmological hydrodynamic zoom simulations. We utilize a newly developed
version of TreeSPH that employs a pressure-entropy formulation of smoothed
particle hydrodynamics (SPH) that obviates many of the potential discrepancies
noted between grid-based codes, traditional SPH codes, and moving-mesh algorithms.
In particular, we employ the hydrodynamic code GIZMO in P-SPH
mode, which conserves momentum, energy, angular momentum and entropy, and
includes newly developed algorithms to treat the artificial viscosity, entropy diffusion
and time-stepping. The gravity solver is a modified version of the
GADGET-3 solver, and an updated softening kernel to better represent the
potential of the SPH smoothing kernel is included.

The simulations are fully cosmological zoom-in calculations of the evolution of 
individual galaxies. A (144 Mpc)³ cosmological volume was simulated at low resolution
down to redshift z = 0 with dark matter only. The halo of interest was
identified, and re-simulated at much higher resolution with baryons included.
The initial conditions were generated with the MUSIC code. We simulate four
zoom galaxies: one is our main galaxy, and the other three are at varying resolutions
and masses for the purposes of testing. The main galaxy of interest to this
study resides in a dark matter halo of mass M_DM = 3 × 10¹³ M⊙ at z = 2. The
initial baryonic particle masses in the high-resolution region were 2.7 × 10⁵ M⊙
and the minimum baryonic/stars/dark matter force softening lengths were 9/21/142
proper pc at z = 2. The physical properties of all of the modelled galaxies are
presented in Extended Data Table 1. 

The baryonic physics implemented in GIZMO was developed on the basis of
extensive tests studying idealized simulations of both isolated disks and galaxy
mergers. The gas cools using a cooling curve, updated relative to standard
implementations in SPH codes, that includes both atomic and molecular line emission.
The modelled ISM is multiphase. The neutral ISM is broken into an atomic
and molecular component following algorithms that scale the molecular fraction
with column density and gas phase metallicity. Star formation occurs in
molecular gas above a threshold density (here, this is set to n_thresh = 10 cm⁻³).
Star formation is further restricted to gas that is locally self-gravitating, where: 


$$\alpha \equiv \beta' \, \frac{|\nabla \cdot \mathbf{v}|^{2} + |\nabla \times \mathbf{v}|^{2}}{G\rho} < 1 \qquad (1)$$


where α is the usual virial parameter, β′ ≈ 1/2, v is the gas velocity, G is the gravitational
constant and ρ is the density of the gas. This follows from studies that show
that the predicted spatial distribution of star formation in galaxies is more realistic 
when using a gas self-gravitating criterion compared to a variety of other algo- 
rithms (including a fixed density threshold, a pure molecular-gas law, a temper- 
ature threshold, a Jeans criterion, a cooling-time criterion and a converging flow 
criterion). The SFR follows a volumetric relation: 


$$\dot{\rho}_{\star} = \rho_{\rm mol} / t_{\rm ff} \qquad (2)$$


where ρ̇★ is the star formation rate density, ρ_mol is the molecular gas density, and t_ff is the local gas
free fall time. In other words, stars are allowed to form with 100% efficiency per free
fall time. The star formation is subsequently self-regulated by stellar feedback,
resulting in a time-averaged efficiency on galaxy scales, ε★, of ~0.005-0.1 (ref. 39).
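
The two star formation criteria and equation (2) can be illustrated with a minimal Python sketch. It is not the GIZMO implementation: the function names, the cgs constants, the pure-hydrogen number density and the free fall time definition t_ff = sqrt(3π/(32Gρ)) are assumptions introduced here for illustration only.

```python
import math

G_CGS = 6.674e-8        # gravitational constant [cm^3 g^-1 s^-2]
M_PROTON = 1.6726e-24   # proton mass [g]


def virial_parameter(div_v, curl_v, rho, beta=0.5):
    """Equation (1): alpha = beta * (|div v|^2 + |curl v|^2) / (G * rho).

    div_v and curl_v are the magnitudes of the velocity divergence and
    curl [s^-1]; rho is the gas density [g cm^-3].
    """
    return beta * (div_v ** 2 + curl_v ** 2) / (G_CGS * rho)


def free_fall_time(rho):
    """Assumed standard definition: t_ff = sqrt(3*pi / (32*G*rho)) [s]."""
    return math.sqrt(3.0 * math.pi / (32.0 * G_CGS * rho))


def sfr_density(rho, f_mol, div_v, curl_v, n_thresh=10.0):
    """Equation (2) for gas that passes both criteria, else zero.

    f_mol is the molecular mass fraction of the gas (hypothetical input);
    the result has units of g cm^-3 s^-1.
    """
    n_gas = rho / M_PROTON  # crude number density, pure hydrogen assumed
    if n_gas < n_thresh:
        return 0.0          # below the density threshold
    if virial_parameter(div_v, curl_v, rho) >= 1.0:
        return 0.0          # not locally self-gravitating
    return f_mol * rho / free_fall_time(rho)
```

In this sketch the local efficiency per free fall time is unity, as in the text; the low galaxy-scale efficiency arises only once feedback is applied, which the sketch does not model.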
Once stars have formed, they can impact the ISM via various feedback mechanisms.
Assuming a Kroupa stellar initial mass function, and using STARBURST99
for luminosity, mass-return and supernova rate calculations as a function of stellar
age and metallicity, we include the following forms of stellar feedback. 
Radiation momentum deposition. At each timestep, the gas near young stars is 
affected by a momentum flux given by: 


$$\dot{P}_{\rm rad} \approx \left(1 - \exp(-\tau_{\rm UV/optical})\right)\left(1 + \tau_{\rm IR}\right) L_{\rm incident}/c \qquad (3)$$


where L_incident is the incident luminosity, τ_UV/optical is the optical depth to UV/optical
photons, τ_IR = Σ_gas κ_IR, Σ_gas is the column of gas and κ_IR = 5(Z/Z⊙) g⁻¹ cm².
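
As a worked illustration of equation (3), the following Python sketch evaluates the momentum flux for a single gas element. The function names and the cgs unit choices are assumptions made here; only the functional form and the opacity scaling come from the text.

```python
import math

C_CGS = 2.998e10  # speed of light [cm s^-1]


def kappa_ir(z_over_zsun):
    """Infrared opacity from the text: kappa_IR = 5 (Z/Zsun) cm^2 g^-1."""
    return 5.0 * z_over_zsun


def radiation_momentum_flux(l_incident, tau_uv_optical, sigma_gas, z_over_zsun):
    """Equation (3): momentum flux [g cm s^-2] deposited near young stars.

    l_incident     -- incident luminosity [erg s^-1]
    tau_uv_optical -- optical depth to UV/optical photons
    sigma_gas      -- gas column [g cm^-2]
    z_over_zsun    -- metallicity in solar units
    """
    tau_ir = sigma_gas * kappa_ir(z_over_zsun)
    absorbed = 1.0 - math.exp(-tau_uv_optical)  # fraction absorbed once
    return absorbed * (1.0 + tau_ir) * l_incident / C_CGS
```

Two limits are easy to read off: with no UV/optical optical depth nothing is deposited, and for optically thick gas with τ_IR = 1 the deposited momentum is twice the single-scattering value L_incident/c.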
Supernovae and stellar winds. We utilize tabulated type-I and type-II supernova
rates; if a supernova occurs during a timestep, thermal energy and radial
momentum are injected within a smoothing length of the star. Gas and metal 
return are included as well. Stellar winds are similarly included with energy, wind 
momentum, mass and metals deposited within a smoothing length. 
Photoheating of H II regions. The production rate of ionizing radiation from stars
determines the extent of H II regions (allowing for overlapping regions). These
regions are heated to 10⁴ K if the gas is below that temperature.

We utilize models TL37 SR and TL37 HR in Extended Data Table 1 to test the 
convergence properties of our simulations. One model is run with the same mass 
baryonic resolution as our main model (standard resolution; SR), and one a factor 
of ~8 higher resolution (high resolution; HR). In Extended Data Fig. 7, we show 
the modelled duty cycle above a given flux density as a function of flux density for 
these two models. We see that the shortest lived (<200 Myr) emission spikes 


present in the standard resolution model may not be converged in the highest 
resolution model. Notably, emission with longer duty cycles is either converged, or 
underpredicted in our standard resolution model, suggesting that the relatively 
long-lived submillimetre-luminous phase is robust. 
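
The duty cycle above a given flux density, as used in this convergence test, can be sketched in Python as follows. The light curve below is invented purely for illustration and is not simulation output; the function name and the choice to assign each interval to the flux at its starting snapshot are likewise assumptions.

```python
def time_above_threshold(times_gyr, flux_mjy, threshold_mjy):
    """Total time [Gyr] spent above a flux density threshold.

    Each interval between consecutive snapshots is counted if the flux
    density at the start of the interval exceeds the threshold.
    """
    total = 0.0
    for i in range(len(times_gyr) - 1):
        if flux_mjy[i] > threshold_mjy:
            total += times_gyr[i + 1] - times_gyr[i]
    return total


# Hypothetical 850 micron light curve: snapshot times [Gyr] and fluxes [mJy].
times = [0.0, 0.25, 0.5, 0.75, 1.0]
flux = [2.0, 6.0, 12.0, 8.0, 3.0]

# Duty cycle as a function of the flux density threshold.
duty_cycles = {s: time_above_threshold(times, flux, s) for s in (5.0, 10.0, 20.0)}
```

Comparing such curves between two resolutions shows at which thresholds the time spent above a given flux density agrees and where short-lived spikes differ.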

We show the M★-z relation for the central galaxy in Extended Data Fig. 8 as
compared to observational constraints. The central galaxy has a stellar mass a
factor of ~2 greater than the observed median stellar mass for comparable mass
haloes at this epoch. The model galaxy may represent an outlier in the M★-z
relation at these redshifts. Indeed, the thickness of the observational constraints 
shows the uncertainty, not range, in possible values. Alternatively, it is possible 
that the inclusion of feedback from an active galactic nucleus (AGN) could impact 
the stellar mass buildup in the galaxy, although the level to which black hole 
growth can impact star formation near the submillimetre-luminous phase is 
unclear. Some models have shown that AGN can grow efficiently in the absence
of major mergers, while other models and observations suggest that mergers
may be necessary to grow massive black holes. The last major merger occurs
~1 Gyr before the submillimetre-luminous phase. Tests with our low resolution
model (m13m14) show that without AGN feedback, residual star formation drives 
a factor of ~2 increase in stellar mass at late times (z ~ 0-1). Finally, we note that 
a higher mass resolution model could potentially also result in decreased final 
stellar masses. In our convergence tests, the final M★ (at z = 2) of the HR run is
~60% that of the SR run. 
Dust radiative transfer calculations. To calculate the inferred observational 
properties of our simulated galaxies, we developed a dust radiative transfer pack- 
age, POWDERDAY. In short, POWDERDAY takes hydrodynamic simulations of 
galaxies in evolution, projects the gas properties onto an adaptive mesh and 
calculates the radiative transfer from the stellar sources through the dusty ISM 
until an equilibrium dust temperature is achieved. 

In detail, we identify galaxies using SKID to locate bound groups of baryonic
particles, and track their progenitors back in time. Galaxies and haloes are
required to contain at least 64 particles each in order to be identified. We cut out a
200 kpc (side length) region around the galaxy of interest, and subdivide the
domain into an adaptive grid with an octree memory structure. Formally, we begin
with one cell encompassing the entire 8 × 10⁶ kpc³ radiative transfer region. The
cells then recursively subdivide into octs until there are at most a threshold
number of gas particles in the cell (we employ n_subdivide,thresh = 64, although experiments
with n_subdivide,thresh = 32 show converged results). The physical properties of
the gas particles are projected onto the octree using a spline smoothing kernel. 
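
The recursive subdivision described above can be sketched in a few lines of Python. This is an illustration of the refinement rule only, not the POWDERDAY grid builder: the function name, the tuple layout of a leaf cell, the depth limit and the random test particles are all assumptions.

```python
import random


def build_octree(points, center, half_size, max_particles=64, max_depth=20):
    """Return leaf cells as (center, half_size, n_particles) tuples.

    A cell is split into eight octs while it holds more than
    `max_particles` particles (64 in the text) and depth remains.
    """
    if len(points) <= max_particles or max_depth == 0:
        return [(center, half_size, len(points))]
    # Sort particles into the eight octants of this cell.
    octants = {}
    for p in points:
        key = tuple(p[axis] >= center[axis] for axis in range(3))
        octants.setdefault(key, []).append(p)
    leaves = []
    quarter = half_size / 2.0
    for x in (False, True):
        for y in (False, True):
            for z in (False, True):
                key = (x, y, z)
                child_center = tuple(
                    center[axis] + (quarter if key[axis] else -quarter)
                    for axis in range(3)
                )
                leaves.extend(
                    build_octree(octants.get(key, []), child_center,
                                 quarter, max_particles, max_depth - 1)
                )
    return leaves


# Hypothetical particles in a 200 kpc box centred on the origin [kpc].
random.seed(0)
particles = [tuple(random.uniform(-100.0, 100.0) for _ in range(3))
             for _ in range(5000)]
leaves = build_octree(particles, (0.0, 0.0, 0.0), 100.0)
```

The leaves tile the full 8 × 10⁶ kpc³ volume, with small cells where particles are dense and large cells elsewhere, which is the property the radiative transfer grid relies on.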

The spectral energy distributions of stars are calculated on the fly with the
Flexible Stellar Population Synthesis code, FSPS, through PYTHON-FSPS, a
set of PYTHON hooks for FSPS (https://github.com/dfm/python-fsps). The SEDs 
are calculated as simple stellar populations with ages and metallicities determined 
by the hydrodynamic simulation, and assuming a Kroupa IMF. 

The radiative transfer happens in a Monte Carlo fashion using the three-dimensional
dust radiative transfer solver, HYPERION. The code uses an iterative
methodology to determine the radiative equilibrium temperature, and we determine
convergence when the energy absorbed by 99% of the cells has changed by
less than 1% between iterations. We assume a dust grain-size distribution comparable
to that of the Milky Way, with R_V = A_V/E(B − V) = 3.15, where A_V is the
visual extinction and E(B − V) is the difference between the B- and V-band
extinctions. The dust emissivities are updated to include an approximation for
polycyclic aromatic hydrocarbons (PAHs) alongside thermal emission. We
assume a constant dust to metals ratio of 0.4, motivated by both Milky Way
and extragalactic observational constraints.
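
The stopping rule quoted above (99% of cells changing by less than 1% between iterations) can be written as a short check. The sketch below is a plain-Python reading of that sentence and is not the HYPERION implementation; the function name and the treatment of cells with zero absorbed energy are assumptions.

```python
def is_converged(previous, current, cell_fraction=0.99, tolerance=0.01):
    """True if enough cells have a stable absorbed energy.

    previous, current -- absorbed energy per cell in two successive
    iterations (same ordering). A cell is stable if its relative change
    is below `tolerance`; empty cells are stable only if they stay empty.
    """
    if not previous:
        raise ValueError("need at least one cell")
    n_stable = 0
    for e_old, e_new in zip(previous, current):
        if e_old == 0.0:
            stable = e_new == 0.0
        else:
            stable = abs(e_new - e_old) / e_old < tolerance
        if stable:
            n_stable += 1
    return n_stable / len(previous) >= cell_fraction
```

With 100 cells, for example, the check passes when one cell still changes strongly but fails when two do.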

The underlying HYPERION code has passed the standard benchmarks for
codes of this type, and we found that POWDERDAY compares well against
other publicly available dust radiative transfer codes in test starburst SPH
galaxy merger simulations. 
Parameter choices. In Extended Data Fig. 9, we present a number of tests of our 
parameter choices for the radiative transfer calculations. We show the predicted 
850 μm light curve from our lowest resolution model (m13m14) using fiducial
parameters, as well as three parameter choice variations. 

We first ask whether our chosen radiative transfer grid size affects our principal 
results. Our fiducial model is a 200 kpc (on a side) box cut out of the global cosmological
simulation centred on the halo of interest. This size was chosen to reflect a
rough average of the (sub)millimetre beam sizes typically used to detect SMGs.
For example, assuming Planck 2013 cosmological parameters, the Submillimetre
Common-User Bolometer Array (SCUBA) on the James Clerk Maxwell Telescope
(JCMT) has a 15″ full-width at half-maximum (FWHM) beam at 850 μm. At z = 2
this corresponds to ~128 kpc. At the same redshift, the beam of AzTEC and
LABOCA at 1 mm on the JCMT corresponds to ~163 kpc (19″); the South Pole
Telescope (SPT) has a beam size of 540 kpc at 1.4 mm (63″); and Herschel’s SPIRE
instrument ranges from 154 kpc to 308 kpc (250-500 μm; 18″-36″).
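
The conversion from beam FWHM to physical size can be checked with a short Python sketch. It assumes a flat ΛCDM cosmology with H0 = 67.3 km s⁻¹ Mpc⁻¹ and Ωm = 0.315 as stand-ins for the Planck 2013 parameters (the exact values used are not stated in the text), neglects radiation, and integrates the comoving distance with a simple trapezoidal rule; the function names are hypothetical.

```python
import math

C_KM_S = 299792.458                        # speed of light [km s^-1]
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi  # arcseconds per radian


def angular_diameter_distance_mpc(z, h0=67.3, omega_m=0.315, n_steps=1000):
    """Angular diameter distance [Mpc] in flat LambdaCDM (assumed values)."""
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Trapezoidal integration of dz / E(z) from 0 to z.
    dz = z / n_steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, n_steps):
        integral += inv_e(i * dz)
    integral *= dz
    comoving = C_KM_S / h0 * integral
    return comoving / (1.0 + z)


def beam_size_kpc(fwhm_arcsec, z=2.0):
    """Physical size [kpc] subtended by a beam of given FWHM at redshift z."""
    d_a_kpc = angular_diameter_distance_mpc(z) * 1000.0
    return fwhm_arcsec / ARCSEC_PER_RAD * d_a_kpc
```

Under these assumptions the scale at z = 2 is close to 8.6 kpc per arcsecond, so `beam_size_kpc(15.0)` lands within a few kiloparsecs of the ~128 kpc quoted for SCUBA, and the other beams scale linearly with their FWHM.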
Because a few notable beam sizes (of particular relevance, the SCUBA beam) are
smaller than our assumed box size of 200 kpc, we have run an additional model
with box length of 100 kpc (and all other parameters exactly the same). We
highlight the resultant 850 μm light curve from this model in the top right panel
of Extended Data Fig. 9. When comparing to our fiducial model, it is apparent that 
our results are robust to the highest resolution beams that have been used for SMG 
surveys at single dish facilities to date. 

We additionally investigate whether our inclusion of PAHs in our model makes 
any difference to the calculated submillimetre-wave flux density of our model 
galaxy. This is presented in the bottom left panel of Extended Data Fig. 9. 
Again, we note minimal impact on the submillimetre SED of our model. 

Finally, we ensure that our results are converged with the number of photons 
emitted. We fiducially run 10⁷ photons per grid (roughly 100 per cell). In the bottom
right panel of Extended Data Fig. 9, we show the results from a run with 10° photons 
per grid, and show that the results are robust against this parameter choice. 
Relation to other models. Historically, the methods and physical models used to study
SMG formation in numerical simulations have been quite varied. Here, we summarize
these methods and results, and place our own model into this context. Broadly, 
there are three classes of SMG formation models: cosmological semi-analytic 
models (SAMs), idealized non-cosmological simulations and cosmological hydro- 
dynamic models. The present model falls into the last category. Our model is the 
first self-consistent cosmological simulation with baryons and bona fide radiative 
transfer to form a submillimetre galaxy with physical properties comparable to 
those observed. 

The initial forays into this field were typically with SAMs. This is because SAMs 
are computationally inexpensive, and allow for a large search in physical para- 
meter space relatively easily. SAMs either utilize analytic halo merger trees, or 
directly simulate them, and then employ analytic prescriptions to describe the 
central galaxies. The Durham SAM couples galaxies formed in a semi-analytic
model with dust radiative transfer. These simulations model galaxies that have 
axisymmetric geometries that consist of a disk and a bulge. Young stellar popula- 
tions are assumed to still be enveloped in their birth clouds, and thus experience 
additional attenuation. This model suggests that ~22% of SMGs originate
from major mergers, the remainder from minor mergers, and that the stellar IMF
is flat during the starburst. The typical lifetime for the submillimetre-luminous
phase is ~100 Myr (a factor of ~7.5 lower than found in our work), galaxies are
extremely gas rich (f_gas ~ 75%), and stellar masses are a factor of ~10 lower than
predicted by our model (M★ ~ 2 × 10¹⁰ M⊙). While the stellar masses of SMGs
are debated, the gas fractions appear to be uniformly lower in observations,
and a flat stellar IMF is probably ruled out by CO dynamical mass
measurements.

As an alternative to cosmological simulations, a number of studies have explored
SMG formation in idealized simulations. These studies evolve hydrodynamic
models of idealized disks and mergers over a range of merger mass ratios, and
combine these with dust radiative transfer simulations. These models infer halo
masses and stellar masses for SMGs comparable to those modelled here. This said,
in the idealized galaxy models, ~30%-70% of the SMGs (flux dependent) originate
in merger-driven starbursts, substantially higher than what is found for our model.
Some studies have noted that binary mergers that cause SMGs may break up into
multiples at high-resolution owing to the contribution to the total flux of individual 
inspiralling disks. Because idealized simulations are non-cosmological in nature, 
comparing the multiplicity inferred from these to our models is difficult: the major 
merger multiplicity can only be two when considering galaxies at the same redshift. 
On the other hand, Extended Data Fig. 2 suggests that potentially larger multipli- 
city can be observed for physically associated clumps. 

To fully capture the cosmic environment of SMGs during their formation, as 
well as their baryonic structure and morphology, cosmological hydrodynamic 
simulations are probably the best tool. Thus far, cosmological hydrodynamic 
simulations used to simulate SMGs have not employed direct radiative transfer 
models’’. As such, inferring when a galaxy is an SMG in cosmological simulations 
has necessitated the use of parameterized emission models, such as assumed grey- 
body emission laws’, or SFR thresholds'®. The physical properties for SMGs 
derived from the most extensive of these studies’® (that is, M⋆, M_DM and f_gas)
are similar to the model presented here, although with a factor of ~3
difference in SFR. 

Code availability. We have made POWDERDAY available at https://bitbucket.org/desika/powderday,
and GIZMO available at https://bitbucket.org/phopkins/gizmo.
Extended Data Figure 1 | Mass of reservoir gas in the central galaxy that will
be consumed during SMG starburst as a function of z. The colour scale
denotes the median scale height from the galaxy centre of mass. The gas mass
consumed during the starburst is calculated by tracking the evolution of gas
particles that turn into stars during the SMG phase (z ~ 2-2.7), and is only
measured for the central galaxy itself (that is, gas ejected into the halo is not
included). The SMG gas reservoir follows a cycle of being pushed outward
followed by re-accretion.
Extended Data Figure 2 | Distribution of flux density ratio of brightest
component in submillimetre-luminous region to total flux density. The
average is shown with the vertical line. Submillimetre-luminous regions often
break up into multiples. The region is generally dominated by one component,
although smaller subhaloes can contribute on average ~30% of the observed
flux density. The normalization of the ordinate, P, is arbitrary.
Extended Data Figure 3 | Gas surface density for the central submillimetre
galaxy. The blue histogram shows the distribution of gas surface densities
(Σ_gas) during all phases (that is, all snapshots, Snaps), while the pink histogram
shows the same for the submillimetre-luminous phase. The ordinate (N) is
weighted by the time a galaxy spends in a given gas surface density bin, and the
normalization is arbitrary. We predict that the submillimetre-luminous phases
do not have dramatically different surface density distributions compared to
the non-submillimetre-luminous phases. This prediction might have been
tentatively observed"*”.
stellar mass. Blue stars show individual snapshots of the central submillimetre
galaxy, while red circles with error bars (1σ) show observations of BzK
galaxies and SMGs with direct CO(J = 1-0) measurements (to avoid
complications in converting from higher-lying CO rotational lines to the
[...] selected on the basis of their B, z and K band luminosities.) Both the
observations and our model show a declining molecular gas fraction (f_gas) with
increasing galaxy mass (M⋆), with a typical range of f_gas = 0.1-0.4 for galaxies of
SMG mass.
Extended Data Figure 5 | Predicted spectral energy distribution (SED) for
the central submillimetre galaxy. The ordinate shows the flux density in mJy,
while the abscissa shows the wavelength in µm. The blue shaded region
shows the range of SEDs for all simulation snapshots that satisfy the fiducial
F_850µm > 5 mJy submillimetre galaxy selection criteria, while the dark grey
coloured lines show the SEDs for individual submillimetre-luminous
snapshots. The data and models are redshifted to a common redshift z = 2.
The model and data compare well, and the model suggests a diverse range
of SMG SEDs.
Extended Data Figure 6 | Overestimate of the SFR of high-z SMGs. The
ordinate denotes the SFR as determined from the infrared SED (SFR_IR)”, while
the abscissa shows the SFR averaged over the last 50 Myr in the simulations
(SFR_50). Up to an SFR of ~800 M⊙ yr⁻¹ the two correspond well. At higher
SFRs, however, there is a dramatic departure owing to substantial contribution
to the infrared luminosity by older stars.
simulations. Lines show the 850 µm duty cycle above a given flux density as a
function of flux density for our resolution test models presented in Methods.
[...] a one-level-higher refinement.
galaxy. The purple line shows model results, while the dark-blue filled region
shows observational constraints from an abundance matching assumption”.
The model and observations are in reasonable agreement, especially during
[...] stellar mass of the galaxy is a factor of ~2 higher than the median observed
galaxy.
Extended Data Figure 9 | Tests of parameter choices for radiative transfer
calculations. The simulated galaxy for these tests is our lowest resolution
cosmological simulation (m13m14). Each panel shows the 850 µm flux density
light curve of the tested model, with time noted on the abscissa (redshift on
the bottom, time since the Big Bang on the top). In all panels, the shaded region
denotes S_850 = 5 mJy, which is the canonical selection criterion for SMGs. Top
left, our fiducial set of parameters; top right, simulation with a 100 kpc (on a
side) emission region instead of 200 kpc; bottom left, simulation with our
model for PAHs turned off; bottom right, fiducial simulation run with ten times
the number of photons.
Extended Data Table 1 | Summary of model galaxies

Model Name    Model Purpose    M⋆ (z = 2)    M_halo (z = 2)    m_b    m_DM    ε_b    ε_DM    Final Redshift

Main Model
Resolution Test
Resolution Test
RT Parameter Survey

M⋆ and M_halo refer to the stellar and halo mass at z = 2; ε_b and ε_DM refer to the minimum force softening lengths for baryons and dark matter particles, respectively; and m_b and m_DM refer to the baryonic and dark
matter particle masses, respectively. For model m13m14, RT stands for radiative transfer.
The diurnal cycle of water ice on comet
67P/Churyumov-Gerasimenko


M. C. De Sanctis¹, F. Capaccioni¹, M. Ciarniello¹, G. Filacchione¹, M. Formisano¹, S. Mottola’, A. Raponi’, F. Tosi’,
D. Bockelée-Morvan’, S. Erard?, C. Leyrat®, B. Schmitt*, E. Ammannito¹°, G. Arnold’, M. A. Barucci®, M. Combi®, M. T. Capria’,
P. Cerroni¹, W.-H. Ip’, E. Kuehrt?, T. B. McCord, E. Palomba¹, P. Beck*, E. Quirico* & the VIRTIS Team*


Observations of cometary nuclei have revealed a very limited 
amount of surface water ice’’, which is insufficient to explain 
the observed water outgassing. This was clearly demonstrated on 
comet 9P/Tempel 1, where the dust jets (driven by volatiles) were 
only partially correlated with the exposed ice regions*. The obser- 
vations®’ of 67P/Churyumov-Gerasimenko have revealed that 
activity has a diurnal variation in intensity arising from changing 
insolation conditions. It was previously concluded that water 
vapour was generated in ice-rich subsurface layers with a transport 
mechanism linked to solar illumination’”, but that has not hith- 
erto been observed. Periodic condensations of water vapour very 
close to, or on, the surface were suggested*” to explain short-lived 
outbursts seen near sunrise on comet 9P/Tempel 1. Here we report 
observations of water ice on the surface of comet 67P/Churyumov- 
Gerasimenko, appearing and disappearing in a cyclic pattern that 
follows local illumination conditions, providing a source of loca- 
lized activity. This water cycle appears to be an important process 
in the evolution of the comet, leading to cyclical modification of the 
relative abundance of water ice on its surface. 

The Visible Infrared and Thermal Imaging Spectrometer VIRTIS”® 
has collected data of high spatial (7-25 m per pixel) and spectral 
resolution since the Rosetta spacecraft approached the nucleus of 
comet 67P/Churyumov-Gerasimenko in August 2014. The reflec- 
tance spectra, taken in different areas over the illuminated regions of 
the comet’s nucleus, show a broad absorption band at 2.8-3.6 µm,
attributed to organic compounds. The absence of pure water ice
absorption bands indicates an upper limit of about 1% (by volume) 
of water ice, in very limited surface regions, at VIRTIS resolution’. 
Figure 1 shows a small region of the ‘neck’ (longitude 325° ± 4° E,
latitude 31° ± 5° N, called Hapi) of the comet, located between the
small and large lobes of the nucleus, observed at different rotational
phases after one or more comet rotations. During each rotation, this
region moves into the shadows projected by the head (the smaller lobe)
bulge. VIRTIS observes variations in the absorption band near 3 µm as
this region moves out of the shadow and becomes illuminated (Fig. 2a).
In this case we observe a clear alteration of the organic compounds
band, with a broadening, a shift towards shorter wavelengths and a
strong increase in depth with decreasing illumination. The band centre
shifts from 3.22 ± 0.05 µm to 3.12 ± 0.05 µm, its short-wavelength
shoulder shifts from 2.82 ± 0.05 µm to 2.71 ± 0.05 µm, and the band
depth relative to the continuum increases from 0.20 ± 0.03 to
0.60 ± 0.03. We also observe a flattening of the continuum slope
and a reduced thermal emission as the 3-µm band depth increases.
The stronger 3-µm band and the flattening of the continuum suggest
exposed water ice, in addition to the organic material ubiquitously
present on the comet’s surface*. The spectral ratio (Fig. 2) between the
spectrum of a pixel close to the shadow and the spectrum of an illuminated
pixel shows the presence of a strong ice band extending from
2.7 µm to 3.6 µm, while the other water ice bands at 1.5 µm and 2.0 µm
are not detected" (see Extended Data Fig. 1). The 3-µm water band is

Figure 1 | Images of the ice-rich area. a, Rosetta
Optical Navigation Camera (OPNAV) context
image of the region under study (red box);
b-e, VIRTIS image at 0.7 µm of the region in the
red box of a. The data in b, c and d were acquired on
12, 13 and 14 September 2014, respectively. The
VIRTIS data in b and c are separated by ~12 h,
corresponding to ~1 comet rotation, while the data
in c and d are separated by 37.3 h, corresponding
to ~3 comet rotations. The coloured dots in
b indicate the zones from which the spectra in Fig. 2
are taken. Panel e is the same as d, but stretched
to see the jet (white arrows).
Figure 2 | Spectra of the ice-rich areas. a, Spectra from Fig. 1b going from
illuminated pixels to shadow. Black, red and cyan spots in Fig. 1b correspond
to black, red and cyan spectra, respectively, taken at steps of 1 pixel. At
wavelengths >3.5-3.7 µm, the spectra show smaller thermal emission for the
pixels closer to the shadow line. The retrieved temperatures are 175 ± 8 K,
184 ± 5 K and 195 ± 4 K for the cyan, red and black spots, respectively.
b, Spectral ratio of the cyan and black spectrum of a (solid line) and a synthetic
spectrum of water"! (grain size of 10 µm, dashed line). Instrument filters are
indicated by grey bars.


clearly present in all the pixels located at the border of the shadowed 
region. The same region has been observed again one and four nucleus 
revolutions” later (Fig. 1b-d) under slightly different illumination 
conditions, as shown by the shadows which cover different areas. 
Nevertheless, in each observation, the presence and change in the 
water absorption band depth is controlled by the shadow location
and not by the specific surface region. VIRTIS observes areas in which
the spectra display progressive 3-µm band weakening as the region
moves into greater illumination. Shadowed areas are slightly different 
from the illuminated regions in the VIRTIS observations (Fig. 1b-d), 
and ice-rich and ice-free pixels change according to their distance from 
the shadow: ice-free pixels in the first observation (Fig. 1b) show the ice 
signature (‘ice-rich’) on the following observation (Fig. 1c, d) where 
they are now closer to the shadow. 

Using optical constants'*'* and scattering theory’, we modelled the 
spectra as an intimate mixture of ice and a dark non-ice component 
(Methods, Extended Data Figs 1 and 2) and derived water-ice abund- 
ance maps (Fig. 3). The fit of the spectra requires a relative abundance 
of up to 10-15% of water ice intimately mixed with the non-ice 
component, as shown in the ice distribution maps (Fig. 3a). The maps 
indicate that the maximum quantity of ice is found very close to the 
shadows in all the observations, even if the pixels close to the shadows 
are at a different location on the comet surface. 

The spectra also show differences at wavelengths longer than 3.6 um, 
owing to variations of the thermal emission as the region moves into 
reduced solar illumination, with a correlation of a stronger 3-µm band
with a weaker thermal emission, thus with lower temperatures. The
nucleus surface temperatures were retrieved from the long-wavelength
portion (>4.5 µm) of the VIRTIS spectra’®. The temperatures retrieved
for the pixels near the shadows are at most 160 K with an uncertainty of 


Figure 3 | Ice and temperature maps. a, Ice maps:
ice abundance (by volume) as shown by the
colour scale underneath. The black pixels are those in
shadow or corresponding to sky. Isolated bright
pixels are due to instrumental artefacts.
b, Temperature images. White colours correspond
to high temperatures while red-brown colours
correspond to low temperatures (see colour scale
underneath). c, Scatter plot of ice abundance versus
temperature. The data points are extracted from
a and refer to the region of the neck near the
shadow. The error associated with the ice amount is
20%, and errors on the temperatures are ~3%
above 170 K and ~5% below 170 K.
±10 K, while the temperatures of the well illuminated areas are up to
210 K with an uncertainty of ±2 K (Fig. 3b). The pixels showing the
strongest 3-µm absorption have temperatures in the range 160-180 K
and a derived ice content varying between 5 ± 1% and
14 ± 3% (Fig. 3). The measured temperatures for these pixels are consistent
with the presence of water ice in the nucleus outer layers. Water
ice sublimation pressure’’ varies by three orders of magnitude from 
temperatures in the shadowed areas (<1 nbar), to temperatures in the 
illuminated areas, where the activity occurs. A clear anti-correlation of 
ice abundance with temperatures is seen (Fig. 3c) in the regions near 
the shadows of Fig. 1b, indicating that the temperature is cold enough 
to maintain ice at, or near, the surface in the shadows. 
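
The temperature dependence of the sublimation pressure quoted above can be illustrated with a short sketch. It assumes the Murphy & Koop (2005) parameterization of the saturation vapour pressure over ice, which is not necessarily the relation used for the values in the text; the function names are hypothetical. With this assumed relation the pressure at 160 K comes out below 1 nbar, and the contrast with the warmest illuminated pixels spans several orders of magnitude; the exact contrast depends on which temperatures are compared.

```python
import math


def ice_vapour_pressure_pa(temperature_k: float) -> float:
    """Saturation vapour pressure over water ice, in pascals.

    Assumption: Murphy & Koop (2005) fit, valid above roughly 110 K.
    This is an illustrative choice, not the paper's stated method.
    """
    t = temperature_k
    return math.exp(
        9.550426 - 5723.265 / t + 3.53068 * math.log(t) - 0.00728332 * t
    )


def pa_to_nbar(pressure_pa: float) -> float:
    """Convert pascals to nanobars (1 bar = 1e5 Pa, so 1 nbar = 1e-4 Pa)."""
    return pressure_pa / 1e-4


if __name__ == "__main__":
    # Temperatures bracket the range reported for shadowed and lit pixels.
    for t in (160.0, 180.0, 210.0):
        p_nbar = pa_to_nbar(ice_vapour_pressure_pa(t))
        print(f"T = {t:5.1f} K -> p_sat = {p_nbar:.3g} nbar")
```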

Most of the activity observed in the period August-September 2014 
is from the comet neck®’, where the region previously examined is 
located (Fig. 1a). Activity from this specific region is seen in all the data 
presented here, especially in the data acquired last (Fig. 1e), indicating
that a source of water exists to sustain the flux. The amount of water 
flux coming from the superficial ice documented by VIRTIS represents 
~3% of the total water flux measured by MIRO® (see Methods for 
calculations). From this, we can extrapolate that a much larger area is 
affected by the same mechanism (although this larger area is not 
directly observed) and is thus contributing to a larger amount of total 
water flux. Indeed, all the Hapi region is subjected to this diurnal 
shadowing effect which can lead to the outgassing over much larger 
areas. Note, however, that the contribution to the total
outgassing from these surface-layer sources is limited in time. This
is demonstrated by the progressive decrease of the abundance of 
deposited water ice in pixels exposed for a longer time to the solar 
illumination (see Fig. 3). In the case studied here, the presence of 
surface ice close to the location of the jets indicates that the outgassing 
source is likely to be in the uppermost layers of the surface. 

The above description of VIRTIS observations of ice sublimating in 
this neck region (Hapi) when an area emerges from shadow, and the 
progressive decrease of the ice abundance away from the shadow, 
clearly indicates that a cyclic sublimation–condensation process is at
work during each comet rotation. 

Two possible mechanisms for the cyclic condensation of water on 
unilluminated areas can be considered: (1) the condensation of water 
vapour present in the coma, and/or redeposition of icy grains, on cold 
areas on the nucleus”, or (2) the direct condensation of gas sublimating
from the subsurface under appropriate thermodynamic conditions’? 4.
The first case could indeed occur in the region of the neck
where, because of the large concavity, sublimated molecules from an 
illuminated region could impact and condense on nearby non-illumi- 
nated areas. However, this mechanism seems to be more efficient at 
small heliocentric distance when gas production rate is high enough to 
enhance the re-deposited flux significantly”. The second mechanism 
has been already suggested to explain the outbursts observed by the 
Deep Impact mission on comet 9P/Tempel 1 that appear to occur near 
sunrise on a particular area’, while extensive subsurface sources were 
invoked to explain the overall ambient outgassing as the observed area 
of exposed pure ice has a too-limited extent*. In addition, during the 
fly-by of 103P/Hartley 2, the DIXI mission has revealed surface ice 
notably only along the morning terminator, suggesting diurnal effects”. 

The VIRTIS observations are now able to demonstrate that a mech- 
anism similar to (2) above is at work; our thermo-physical model of 
the nucleus, along with previous literature’*””*, enables the diurnal 
cycle of water to be quantified. When the surface is illuminated, water 
ice sublimates mainly from the uppermost surface layers (Fig. 4 and 
Extended Data Figs 3 and 4). When the surface goes into shadow (or 
into the night side), a temperature inversion occurs between the now 
colder surface layer and the interior layers which maintain a higher 
temperature for a longer time; the magnitude of this process is con- 
trolled by the duration of the shadow/night period and by the thermal 
inertia of the material and extends to a depth defined by the thermal 
skin depth. In the present case, the thermal inertia is constrained by 
Figure 4 | Temperature and water vapour versus time. a, Simulated
temperature at different depths from the nucleus surface to the interior (see key
at right), over one rotational period, for a small region of the neck near the
shadows. b, Simulated flux of water vapour versus time over one rotational
period. The simulation has been done using the ‘Rome’ model’*”* (Methods)
and the parameters reported in Extended Data Table 2. The grey bars
correspond to unilluminated periods of time due to the combination of comet
rotation and shadows.


independent measurements*® and gives rise to a thermal skin depth of
the order of a few centimetres (Fig. 4 and Extended Data Fig. 4).
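
A rough check of the centimetre-scale skin depth can be sketched as follows. The sketch assumes the standard diurnal skin-depth expression δ = sqrt(κP/π), with thermal diffusivity κ = (I/ρc)² written in terms of the thermal inertia I. The thermal inertia (50-70 J m⁻² K⁻¹ s⁻⁰·⁵) and mean density (400 kg m⁻³) are the Extended Data Table 2 values and the ~12 h period is the rotation quoted in Fig. 1; the specific heat of 1,000 J kg⁻¹ K⁻¹ is an assumed illustrative value that is not given in the text, and the function name is hypothetical.

```python
import math


def thermal_skin_depth_m(thermal_inertia: float, density: float,
                         heat_capacity: float, period_s: float) -> float:
    """Diurnal thermal skin depth in metres.

    thermal_inertia: J m^-2 K^-1 s^-0.5
    density:         kg m^-3
    heat_capacity:   J kg^-1 K^-1 (assumed value, not from the paper)
    period_s:        rotation period in seconds
    """
    # kappa = k / (rho * c) with I = sqrt(k * rho * c)  =>  kappa = (I / (rho * c))^2
    diffusivity = (thermal_inertia / (density * heat_capacity)) ** 2
    return math.sqrt(diffusivity * period_s / math.pi)


if __name__ == "__main__":
    period = 12.0 * 3600.0           # ~12 h rotation, as quoted in Fig. 1
    for inertia in (50.0, 70.0):     # range listed in Extended Data Table 2
        depth = thermal_skin_depth_m(inertia, 400.0, 1000.0, period)
        print(f"I = {inertia:.0f} -> skin depth = {100.0 * depth:.1f} cm")
```

Under these assumptions the skin depth comes out at the centimetre scale; a lower specific heat or density would deepen it proportionally.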

Thus, within the few centimetres affected by the heat exchange, 
water vapour still produced by subsurface sublimation in the warmer 
subsurface layers flows through the pores and could re-condense if the 
thermo-physical conditions of the colder upper surface layer allow it. 
By this mechanism the surface layer becomes enriched in water ice. 
The water ice in the uppermost surface layers will be stable until a new 
cycle of solar illumination starts which will increase the surface tem- 
perature and thus trigger again the outgassing of water from the comet 
(Extended Data Fig. 3). 

We thus suggest that the cyclic sublimation–condensation of ice
triggered by varying illumination conditions is a general process acting 
on cometary nuclei. This process implies the cyclical modification of 
the relative abundance of water ice on the surface of the comet, con- 
tributing to the local water activity. This mechanism could lead to 
differential erosion of the surface, producing morphological differ- 
ences or enhancing prior existing inhomogeneities. This mechanism 
could also contribute to activity arising from the pits on comet 
67P/Churyumov-Gerasimenko”, being relevant to the shadowing 
effects in the pits. 

Moreover, the surface erosion of 67P/Churyumov-Gerasimenko 
could keep the water ice close to the surface, thus avoiding fading of 
activity with time. Finally, as this comet moves towards its perihelion, 
which will be reached in August 2015, the solar radiation will increase 
with the inverse square of the heliocentric distance, and thus this 
process of water sublimation–condensation will become progressively
more energetic, possibly leading to outburst events
like those observed on the nucleus of 9P/Tempel 1.
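
The inverse-square scaling invoked above can be made concrete with a small sketch. The heliocentric distances used in the example (about 3.4 au at the time of the observations and about 1.24 au at perihelion) and the solar constant of 1,361 W m⁻² at 1 au are illustrative assumptions, not values taken from the text; the function names are hypothetical.

```python
SOLAR_CONSTANT_W_M2 = 1361.0  # assumed solar flux at 1 au


def solar_flux_w_m2(distance_au: float) -> float:
    """Solar flux at a given heliocentric distance (inverse-square law)."""
    if distance_au <= 0.0:
        raise ValueError("distance must be positive")
    return SOLAR_CONSTANT_W_M2 / distance_au ** 2


def flux_ratio(distance_from_au: float, distance_to_au: float) -> float:
    """Factor by which the flux changes when moving between two distances."""
    return (distance_from_au / distance_to_au) ** 2


if __name__ == "__main__":
    r_obs, r_peri = 3.4, 1.24  # illustrative distances in au (assumptions)
    print(f"flux at {r_obs} au:  {solar_flux_w_m2(r_obs):.0f} W m^-2")
    print(f"flux at {r_peri} au: {solar_flux_w_m2(r_peri):.0f} W m^-2")
    print(f"increase factor:   {flux_ratio(r_obs, r_peri):.1f}")
```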
METHODS


Data. VIRTIS observations were acquired in September 2014 when the Rosetta 
spacecraft was orbiting at a distance of about 27 km from comet 67P/Churyumov- 
Gerasimenko’s surface resulting in a spatial resolution on the ground of about 7 m 
per pixel. During this period the instrument was observing the morning hemi- 
sphere with a solar phase of about 60-70°. VIRTIS-M-IR data were acquired in 
scan mode with an integration time of 3 s. The characteristics of the data used here
are reported in Extended Data Table 1.
Spectral modelling. In order to model nucleus surface spectra, a solution of the 
radiative transfer equation in a particulate medium must be applied. In this paper 
we adopt the Hapke model’’. Our analysis is performed on normalized spectra in 
order to rule out the effect of surface roughness and minimize photometric issues. 
We model the icy regions of the comet, following the equations below, as formed 
by a two end-members regolith made of a ‘dark terrain’ (DT) and water ice. The 
DT represents the average spectrum of the comet surface after photometric cor- 
rection. Water ice is modelled as in ref. 27, starting from optical constants which 
are obtained from refs 13-16 to cover the VIRTIS wavelength range. 

We investigated two mixing modalities: areal and intimate. Areal mixtures are
obtained from a linear combination of the reflectances ($r$) of water ice and DT:

$$r_{\mathrm{eff}} = f\,r_{\mathrm{H_2O}} + (1 - f)\,r_{\mathrm{DT}} \qquad (1)$$

where $f$ is the relative amount of water ice and $r_{\mathrm{eff}}$ is the effective reflectance of the
medium. Intimate mixture (‘salt and pepper’) is modelled as a linear combination
of the single scattering albedos ($w$) of the two end-members and is given by:

$$w_{\mathrm{eff}} = f\,w_{\mathrm{H_2O}} + (1 - f)\,w_{\mathrm{DT}} \qquad (2)$$

where the derived $w_{\mathrm{eff}}$ is used to compute the final reflectance. In both cases, along
with the amount of water, the grain diameter $d$ is also retrieved.
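
The two mixing rules can be sketched in a few lines. The sketch below only implements the linear combinations of equations (1) and (2); converting the mixed single scattering albedo into a reflectance requires the Hapke model, which is not reproduced here. Function names and the sample values are hypothetical.

```python
from typing import List, Sequence


def linear_mix(fraction: float, ice: Sequence[float],
               dark: Sequence[float]) -> List[float]:
    """Band-by-band linear combination f * ice + (1 - f) * dark."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    if len(ice) != len(dark):
        raise ValueError("spectra must have the same number of bands")
    return [fraction * a + (1.0 - fraction) * b for a, b in zip(ice, dark)]


def areal_reflectance(fraction: float, r_ice: Sequence[float],
                      r_dt: Sequence[float]) -> List[float]:
    """Equation (1): mix the reflectances of water ice and dark terrain."""
    return linear_mix(fraction, r_ice, r_dt)


def intimate_albedo(fraction: float, w_ice: Sequence[float],
                    w_dt: Sequence[float]) -> List[float]:
    """Equation (2): mix the single scattering albedos; the result would
    then be passed to a radiative transfer model to obtain a reflectance."""
    return linear_mix(fraction, w_ice, w_dt)


if __name__ == "__main__":
    # Made-up three-band spectra, for illustration only.
    print(areal_reflectance(0.10, [0.90, 0.60, 0.05], [0.05, 0.06, 0.04]))
```

The two functions share the same arithmetic on purpose: the physical difference lies in which quantity is mixed, not in the form of the combination.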

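The modelling of the observed spectra is performed by a retrieval procedure that
searches for the minimum of the reduced chi-square ($\chi^2$), namely the best fit
between the modelled ($r^{m}$) and the observed ($r^{o}$) reflectance:

$$\chi^2 = \frac{1}{\mathrm{DOF}} \sum_{j=1}^{N} \left( \frac{r_j^{o} - r_j^{m}}{\sigma_j} \right)^2 \qquad (3)$$

where $j$ identifies each band ($\lambda_j$), $N$ is the total number of bands, $\sigma_j$ is the error of
the measured reflectance in band $j$, and DOF are the
degrees of freedom.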
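
A minimal sketch of such a retrieval is given below. It evaluates the reduced chi-square of equation (3) and scans a grid of ice fractions. Several simplifications are assumptions of the sketch rather than the procedure used in the paper: the forward model is a simple linear (areal) mixture, only the ice fraction is fitted (the grain diameter is ignored), the degrees of freedom are taken as the number of bands minus the number of free parameters, and a brute-force grid replaces whatever minimizer was actually used. All names are hypothetical.

```python
from typing import List, Sequence, Tuple


def reduced_chi_square(observed: Sequence[float], modelled: Sequence[float],
                       sigma: Sequence[float], n_free_params: int) -> float:
    """Equation (3), with DOF assumed to be N minus the free parameters."""
    dof = len(observed) - n_free_params
    if dof <= 0:
        raise ValueError("need more bands than free parameters")
    total = sum(((o - m) / s) ** 2
                for o, m, s in zip(observed, modelled, sigma))
    return total / dof


def fit_ice_fraction(observed: Sequence[float], sigma: Sequence[float],
                     r_ice: Sequence[float], r_dt: Sequence[float],
                     steps: int = 1000) -> Tuple[float, float]:
    """Grid search over the ice fraction f in [0, 1]; returns (f, chi2)."""
    best_f, best_chi2 = 0.0, float("inf")
    for i in range(steps + 1):
        f = i / steps
        model: List[float] = [f * a + (1.0 - f) * b
                              for a, b in zip(r_ice, r_dt)]
        chi2 = reduced_chi_square(observed, model, sigma, n_free_params=1)
        if chi2 < best_chi2:
            best_f, best_chi2 = f, chi2
    return best_f, best_chi2


if __name__ == "__main__":
    # Synthetic, noise-free example built with a known fraction of 0.1.
    ice, dark = [0.90, 0.60, 0.05, 0.30], [0.05, 0.06, 0.04, 0.05]
    obs = [0.1 * a + 0.9 * b for a, b in zip(ice, dark)]
    print(fit_ice_fraction(obs, [0.01] * 4, ice, dark))
```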

The observed spectra are corrected for spikes and instrumental artefacts. Among 
the residual sources of error, like stray light and signal from the coma, instrumental 
noise is the main contribution to the error of the measured reflectance ($\sigma_j$).

In Extended Data Fig. 1 we report, as an example, the result of a typical spectral fit
for a pixel exhibiting a certain amount of ice obtained in the intimate and the areal
modes. It can be noted that intimate mixing provides the best match with measured
spectra. The reason is that areal mixing increases the relative depth of the 1.5-µm and
2-µm absorption bands with respect to the 3-µm feature, while they are very weak or
absent across the data set investigated in this work. Given this, the water ice abundance
maps are obtained by modelling the spectra as intimate mixtures.

The non-detection of the water ice bands at 1.5 µm and 2.0 µm indicates that the
water ice and non-icy components are intimately mixed. In fact an areal mixture of
1% of water ice and 99% of non-ice materials also yields spectra with well-defined
absorption features at 1.5 and 2 µm as well as an increase in reflectance (Extended
Data Fig. 1). This was the case with comet 9P/Tempel 1, where ice-rich patches on
the nucleus have been modelled through an areal mixture containing 6 ± 3% ice’.

The maps showing the abundance of water ice (Fig. 3) are produced by setting a
threshold of 50 on the median S/N of the spectra in order to exclude pixels in shadow
or off the nucleus of the comet.

In Extended Data Fig. 2, we report the spectra of three consecutive pixels going
out of the shadow, and thus with decreasing abundance of water ice. The relative
accuracy of the parameters obtained from the spectral modelling (water ice
amount and grain diameter) is of the order of 20%, due to the instrumental noise
and the uncertainty on the level of the dark terrain.
Estimation of water flux. The contribution to the total outgassing was estimated 
using only the extent of ice in our data. We calculated the surface area in the data 
that contains the transient ice and the percentage of ice in such an area. Using that 
information, and the pixel size, we computed the equivalent area covered by pure 
ice. In the VIR data, this area is ~1 km².
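
The bookkeeping behind the equivalent pure-ice area can be sketched as follows. The sketch assumes that each pixel contributes its ground area scaled by its retrieved ice fraction, which treats the abundance as an areal coverage and is a simplification; the 7 m pixel size is the value quoted in the Data section, while the function name and sample fractions are hypothetical.

```python
from typing import Sequence


def equivalent_ice_area_m2(ice_fractions: Sequence[float],
                           pixel_size_m: float) -> float:
    """Sum of pixel areas weighted by the ice fraction of each pixel."""
    if pixel_size_m <= 0.0:
        raise ValueError("pixel size must be positive")
    pixel_area = pixel_size_m ** 2
    return sum(fraction * pixel_area for fraction in ice_fractions)


if __name__ == "__main__":
    # Two made-up pixels with 10% and 5% ice at 7 m per pixel.
    print(equivalent_ice_area_m2([0.10, 0.05], 7.0), "m^2")
```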

MIRO measured about 10*° molecules s* and they estimated that between 0.1 

and 1% of the 67P nucleus surface is needed to explain the water gas production 
rates if water ice were located on the surface°. Thus, using these values we can say 
that the ‘transient ice’ seen by VIRTIS contributed ~3% to the total water flux. 
However, this is the lower limit of the contribution. In fact we can extrapolate that 
a similar fraction of the neck region, even if not observed by VIR due to obser- 
vation conditions, is subject to a similar diurnal illumination effect and could have 
ice deposits like the imaged area. 
Code availability. The code used to generate the thermal models of comet 67P is a
direct implementation of a published model”***”’. The code used to generate the
spectral fit is described above. The code used to retrieve the nucleus temperatures 
of comet 67P is a direct implementation of a published method’. 


27. Ciarniello, M. et al. Hapke modeling of Rhea surface properties through Cassini- 
Churyumov-Gerasimenko, the new Rosetta target. Astron. Astrophys. 444,
605-614 (2005).

29. Capria, M. T., Coradini, A., De Sanctis, M. C. & Blecka, M. I. P/Wirtanen thermal
evolution: effects due to the presence of an organic component in the refractory
material. Planet. Space Sci. 49, 907-918 (2001).
ssd.jpl.nasa.gov/sbdb.cgi (NASA/Jet Propulsion Laboratory).
Extended Data Figure 1 | Spectral fit. a, b, Two mixing modalities, intimate
(a) and areal (b), are used to model the same spectrum; the spectrum is
identified by its position in the acquired image (‘sample’ and ‘line’) and its
spacecraft event time (scet). The three missing parts of the spectra are related to
the wavelength ranges covered by the junctions of the filters which produce
significant artefacts. They are thus removed during the fitting procedure. The
spectra are normalized with respect to λ = 1.8 µm. For the areal mixture case,
the modelled absorption band at 2 µm is relevant, even with the small
abundance (f_H2O) and grain diameter (d_H2O) we retrieved. This implies a
worse fit, as indicated by the larger χ² value. The intimate mixture is thus a
better model of the spectra.


©2015 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 2 panels: normalized reflectance versus wavelength (µm, 1.5-3.5), observed and modelled spectra, scet = 00369167614, sample = 120. a, line = 62, intimate mixing, f_H2O = 7.9 ± 1.4%, d_H2O = 2.0 ± 0.5 µm. b, line = 63, intimate mixing, f_H2O = 3.5 ± 0.5%, d_H2O = 1.9 ± 0.4 µm. c, line = 64, intimate mixing, f_H2O = 1.6 ± 0.3%, d_H2O = 1.9 ± 0.4 µm.]


Extended Data Figure 2 | Spectral fits of comet nucleus spectra with
different ice content. From a to c, the depth of the absorption band at 3.2 µm
decreases and the band centre moves slightly towards longer wavelengths.
In all cases the spectra are well modelled with a decreasing amount of water ice
(f_H2O) and a constant grain diameter (d_H2O) of that water ice. Missing parts
of the spectra and normalization are as for Extended Data Fig. 1.
[Cartoon panel labels: During the day, water vapour percolates through pores and sublimates away. In the night, in low inertia bodies, water vapour is trapped in surface layers. In the early morning, the first material to sublimate is the ice. Only later, depending on the specific thermal inertia and physical properties of the nucleus, the thermal wave reaches the deeper layers and the overall mechanism starts again.]


Extended Data Figure 3 | Diurnal cycle of water. In the case reported here,
the sublimation of water vapour takes place in a deeper layer (see cartoon
for a graphic explanation). The water vapour filters up to the surface layers
(which are essentially dehydrated, as during the day we do not see any spectral
signature of ice) where, finding lower temperature conditions (as in the
night or in the shadow), it condenses and is trapped as ice. The subsequent
illumination of the surface leads to absolute loss of the condensed water
vapour. This is an effective mechanism of transport of H2O from deeper layers
to the surface.
Extended Data Figure 4 | Temperature and water vapour profiles.
a, Temperature profiles, and b, water vapour pressure profiles, from the nucleus
surface to the interior at different times: the blue curve is the profile when the
area is illuminated; the red curve is the profile obtained 6 min after passing into
shadow; and the purple curve is about 40 min after passing into shadow.
Extended Data Table 1 | Characteristics of VIRTIS observations
Extended Data Table 2 | Main parameters used in the comet model


Physical Quantity 
Rotation Period 
Semi-major Axis
Eccentricity 
Albedo 

Emissivity 


Dust (Silicates) /Ice 


Dust (Organics)/Ice 


Mean Pore Radius


Porosity 


Hertz Factor (the area of contact 
between material grains relative to the 
cross-sectional area) 


Silicatic Dust Thermal Conductivity 


Organic Dust Thermal Conductivity 


H,0 (crystalline) thermal 


conductivity 


Mean Density 


Thermal Inertia


Initial Temperature 


References are Sierks et al.’, Mottola et al.
JPL Small-Bodies*”


References cited in this table are as follows: refs 7, 12, 22, 24, 28, 29, 30. 
Adopted for this model and typically used in cometary thermophysical models

Adopted for this model and typically used in cometary thermophysical models

[22,24]

Adopted for this model and typically used in cometary thermophysical models

[22,24]
3 [W K⁻¹ m⁻¹] [22,24,29]
0.25 [W K⁻¹ m⁻¹]

567/T [W K⁻¹ m⁻¹]

400 [kg m⁻³]

[22,24,29]

[22,24,29]

Resulting from the
50-70 [J m⁻² K⁻¹ s⁻⁰·⁵] Resulting from the parameters used

100 [K] Adopted for this model and typically used in cometary thermophysical models

Controlling neutron orbital angular momentum


Charles W. Clark, Roman Barankov, Michael G. Huber, Muhammad Arif, David G. Cory & Dmitry A. Pushin


The quantized orbital angular momentum (OAM) of photons
offers an additional degree of freedom and topological protection
from noise. Photonic OAM states have therefore been exploited in
various applications ranging from studies of quantum entanglement
and quantum information science to imaging. The OAM
states of electron beams have been shown to be similarly useful,
for example in rotating nanoparticles and determining the chirality
of crystals. However, although neutrons, as massive, penetrating
and neutral particles, are important in materials characterization,
quantum information and studies of the foundations of
quantum mechanics, OAM control of neutrons has yet to be 
achieved. Here, we demonstrate OAM control of neutrons using 
macroscopic spiral phase plates that apply a ‘twist’ to an input neut- 
ron beam. The twisted neutron beams are analysed with neutron 
interferometry. Our techniques, applied to spatially incoherent 
beams, demonstrate both the addition of quantum angular 
momenta along the direction of propagation, effected by multiple 
spiral phase plates, and the conservation of topological charge with 
respect to uniform phase fluctuations. Neutron-based studies of 
quantum information science, the foundations of quantum
mechanics, and scattering and imaging of magnetic, superconducting
and chiral materials have until now been limited to three
degrees of freedom: spin, path and energy. The optimization of 
OAM control, leading to well defined values of OAM, would provide 
an additional quantized degree of freedom for such studies. 

OAM is associated with rotation of a particle about a fixed axis. The 
axial particle currents are encoded in the spiral phase profile of the 
particle’s wavefunction. The component of OAM parallel to the axis 
of rotation is quantized in integer multiples of the reduced Planck
constant ħ. The quantization of OAM is a consequence of the wavefunction
having a single value, which implies that it is a periodic function
of the rotation angle, with a period of 2π radians. When
interactions of the particle with its environment are symmetric with
respect to such rotations, OAM is conserved. Changes in OAM can be
effected by putting a ‘twist’ on the wavefunction. To achieve this with
neutrons, we use a macroscopic phase plate made in the shape of a spiral
staircase, which matches the phase profile of an OAM state. Neutrons
passing through such a spiral phase plate (SPP) obtain axial rotation
around the direction of the beam (the quantization axis), dictated by a
specific design of the plate, as we discuss below. Similar techniques have
been applied previously to light, X-rays and electrons.

The production and detection of neutron OAM states is difficult for 
two reasons. First, even for the cold neutrons used in our experiments, 
it is almost impossible to get two-dimensional (2D) neutron imaging 
detectors with resolution better than the micrometre-scale neutron 
coherence length. Second, our neutron source is a nuclear reactor 
with thermal and cryogenic moderators, spectrally filtered with 
Bragg-scattering monochromators. This makes it difficult to produce 
a spatially coherent neutron beam and directly detect an eigenstate of 
OAM. We estimate that the transverse coherence length of the neutron 
beam in our experiments is in the range from 60 nm (ref. 26) to a few
micrometres (ref. 23) depending on the transverse direction within the
beam profile, while the beam diameter is about 15 mm.

The input to the interferometer (Fig. 1a) contains a mixture of OAM 
states and is spatially incoherent over the transverse displacement of 
the neutron paths. We adopt the strategy of using neutron interfero- 
metry to demonstrate coherent control of self-interfering neutrons. At 
any given time there is at most one neutron in the interferometer. How 
an SPP changes the angular-momentum composition of a wavefunction
Ψ can be thought of as follows. Upon transmission through
the SPP, Ψ is simply modulated by the transmission amplitude
Ψ → exp(iα)Ψ, where α is the phase function of the SPP. Figure 1b is
a schematic diagram of an SPP, and Fig. 1c is a photograph of two such
plates (see details in Methods). As shown in Fig. 1b, the thickness of an
SPP, h, varies with the angle, φ, so that h = h_0 + h_s φ/(2π), where h_0 is
the base height of the SPP and h_s is the step height. Interaction of
low-energy neutrons with materials can be described using an optical
potential. Within this formalism, the phase shift with respect to
vacuum of a neutron passing through an SPP is:


$$\alpha = -N b_c \lambda h = -N b_c \lambda \left( h_0 + h_s \frac{\varphi}{2\pi} \right) \tag{1}$$


where N is the atom density and b_c is the coherent scattering length
of the material composing the phase plate, and λ = 0.271 nm is the




Figure 1 | Schematic diagram of the neutron interferometer. a, The input 
neutron beam is coherently split into two coherent paths by the Bragg 
diffraction at blade 1. Blade 2 serves as a neutron mirror and blade 3 recombines 
the neutron paths and directs them to two neutron detectors: an integrating 
counter and a two-dimensional imaging detector. The neutron counts recorded 
by the two detectors contain information about the relative phase of the neutron 
wavefunction accumulated along the two separate paths. The phase flag is a 
2-mm-thick fused silica plate. By positioning and rotating the phase flag we can 
adjust a uniform phase difference between the neutron paths inside the 
interferometer. The SPP is placed in one path to produce a spatial phase 
distribution across the neutron wavefront. b, Schematic diagram of the SPP. The
step height of the spiral, h_s, is chosen to match a 2π phase shift difference between
a path passing the plate and the reference path in air. c, Photographs of actual
SPPs. Two SPPs that produce a 4π phase shift (h_s = 224 μm), photographed with
the spiral staircase rising clockwise on the front face: one is 10 mm in diameter
(top), the other is 15 mm in diameter (bottom).

neutron wavelength. To attain a 2π phase shift in an aluminium
SPP, the step height is h_s = 112 μm. Note that this value of h_s is much
greater than our neutron wavelength, since the index of refraction, n,
for neutrons at this wavelength in aluminium is very close to unity:
1 − n = N b_c λ²/(2π) ≈ 2.43 × 10⁻⁶. Figure 1c shows photographs
of two SPPs with h_s = 224 μm.
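
The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it uses just the wavelength and the 2π step height given in the text, and the function and constant names are our own. It relies on the observation that the 2π condition N b_c λ h_2π = 2π lets both L and 1 − n be written without separate values for N and b_c.

```python
# Illustrative check; names are hypothetical, values are those quoted in the text.
WAVELENGTH_M = 0.271e-9   # neutron wavelength, 0.271 nm
STEP_2PI_M = 112e-6       # aluminium step height that gives a 2*pi phase shift


def effective_oam(step_height_m: float) -> float:
    """Effective OAM L = N*b_c*lambda*h_s/(2*pi), rewritten as h_s / h_2pi."""
    return step_height_m / STEP_2PI_M


def one_minus_n() -> float:
    """1 - n = N*b_c*lambda**2/(2*pi), which reduces to lambda / h_2pi."""
    return WAVELENGTH_M / STEP_2PI_M


if __name__ == "__main__":
    for h_um in (112, 224, 448, 840):
        print(f"h_s = {h_um} um -> L = {effective_oam(h_um * 1e-6):g}")
    print(f"1 - n = {one_minus_n():.3g}")
```

The four step heights reproduce the effective OAM values 1, 2, 4 and 7.5 used in the experiment, and λ/h_2π agrees with the quoted 1 − n to within the rounding of the 112 μm step height.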

Immediately after passing the SPP a neutron acquires an additional
azimuthal phase twist of 2πL, where the effective angular
momentum L = N b_c λ h_s/(2π) is a function of the spiral step height.
Every OAM component of an incoherent beam entering the phase
plate obtains an additional phase twisting proportional to L. The
average angular momentum of the neutron, measured in units of ħ,
becomes ⟨L⟩ = L_0 + L, where L_0 is the initial average OAM, which
varies for different neutrons of the incoherent beam. Given the
uncertainty of the angular momentum with which a neutron enters 
the phase plate, the outgoing states also remain uncertain. 

The neutron OAM states that we generate are analysed using interferometry.
We fabricated several SPPs, corresponding to phase circulation
of 2π, 4π, 8π and 15π around the singularity of the wavefunction,
or average orbital momenta of L = 1, 2, 4 and 7.5. The input neutron
wavefunction is split along two paths in the neutron interferometer. 
Both neutron paths pass through a 2-mm-thick plate (the phase flag), 
with which we control the relative overall phase difference between 
paths. One neutron path passes through an SPP, which imprints an 
azimuthal phase upon the neutron wavefront. Up to a common overall 
phase, the neutron wavefunction at the entrance of the 2D detector can 
be described in cylindrical coordinates centred at the axis of the phase 
plate as: Ψ = (c_1 e^{iϕ_0} + c_2 e^{−iLφ}) Ψ_0, where φ is the azimuthal angle
about the beam propagation axis (see Fig. 1b), c_1 and c_2 are amplitudes
composed of neutron reflection and transmission coefficients of the
interferometer blades, ϕ_0 is the phase due to the phase flag and Ψ_0 is the
wavefunction of a neutron entering the interferometer. The spatially 
resolved neutron fluence rate at the 2D detector is thus proportional to: 


$$I(\rho, \varphi; \phi_0) \propto \left[ 1 + C \cos(L\varphi + \phi_0) \right] |\Psi_0|^2 \tag{2}$$


where 0 ≤ C ≤ 1 is the interferometric contrast of the interferometer.
For the interferometer used in this work, C = 0.84 without background 
correction. 
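
To make the content of equation (2) concrete, the following sketch evaluates the angular factor 1 + C cos(Lφ + ϕ_0) around one full turn of the beam axis and counts the intensity maxima. It is a toy model under stated assumptions: |Ψ_0|² is set to 1, the contrast defaults to the quoted C = 0.84, and all function names are hypothetical rather than taken from the authors' analysis.

```python
import math


def fluence(phi: float, L: float, phi0: float = 0.0, contrast: float = 0.84) -> float:
    """Angular factor of equation (2), 1 + C*cos(L*phi + phi0), with |Psi_0|^2 = 1."""
    return 1.0 + contrast * math.cos(L * phi + phi0)


def count_lobes(L: int, phi0: float = 0.3, samples: int = 3600) -> int:
    """Count the intensity maxima met on one full turn around the beam axis."""
    values = [fluence(2 * math.pi * k / samples, L, phi0) for k in range(samples)]
    lobes = 0
    for k in range(samples):
        prev, here, nxt = values[k - 1], values[k], values[(k + 1) % samples]
        if here > prev and here >= nxt:
            lobes += 1
    return lobes


def pattern_rotation(L: float, phi0: float) -> float:
    """Angle by which the maxima move when the phase-flag phase changes from 0 to phi0.

    Maxima sit at L*phi + phi0 = 2*pi*m, so they shift by -phi0 / L.
    """
    return -phi0 / L


if __name__ == "__main__":
    for L in (1, 2, 4):
        print(f"L = {L}: {count_lobes(L)} lobe(s)")
```

For integer L the model gives L lobes, and a change in ϕ_0 only rotates the pattern without changing the number of lobes, which is the behaviour described for the phase flag.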

Figure 1a is a schematic diagram of the experiment performed at
the Neutron Interferometry and Optics Facility (NIOF) at the
National Institute of Standards and Technology (NIST). The neutron
interferometer shown there has the standard configuration used in
the field: it is analogous to an optical Mach-Zehnder interferometer.
Created in a 20-MW research reactor, cooled in a cryogenic moderator
to a temperature of 20 K, and transported through 30 m of
neutron guides, the neutrons enter the neutron interferometer in a

Figure 2 | OAM interferograms. Spatial
distribution of the neutron counts in the 2D
detector of the neutron interferometer for four
SPPs, with values of L = 1, 2, 4 and 7.5, as labelled.
The horizontal and vertical positions on the 2D
neutron detector are shown in millimetres. For the
integer values of L these distributions display the
simple OAM interference pattern expressed in
equation (2); for L = 7.5 we have the superposition
of OAM modes given by equation (4) in
Methods. The 2D detector is a centroid-type event-counting
detector with a spatial resolution of
100 μm and an 18% quantum efficiency (that is,
counts registered per neutron incident on the
detector). Its operation is shot-noise (Poisson-noise)
limited in this regime. The neutron counts
were collected over 3.5 days and normalized by the
maximal pixel count, which is about 45.

15-mm-diameter beam, which is Bragg-scattered on the first blade of
the neutron interferometer into two coherent paths. We insert one or
two SPPs into one of the paths, so a neutron in that path acquires a
variation of phase across its wavefront. The second blade of the neutron
interferometer is employed as a lossy mirror: part of the incident
beam is transmitted through the blade and leaves the neutron interferometer
(this part is not shown in Fig. 1a); the remainder of the
beam is Bragg-scattered towards the third blade of the neutron interferometer.
The two paths from the second blade reconnect coherently
at the third blade. This blade combines the interfering transmitted/
Bragg-diffracted neutron paths and directs them into the 2D
detector and an integrating ³He neutron counter. A 2-mm-thick
plate of fused silica is installed between the second and third blades.
This serves as a ‘phase flag’: by rotating this phase flag we can introduce
and control a spatially uniform phase difference between neutron
paths. The spatially resolved data from the 2D detector provide
information on the spatial phase ‘impressed’ upon the neutron
wavefront by the SPP.

Figure 2 shows 2D images using a position-sensitive detector placed 
after the third blade of the neutron interferometer (see Fig. 1a). These 
interferograms use a conventional false-colour spatial representation of 
the time-integrated neutron intensity per pixel of the 2D detector 
(examples of the raw data, data manipulation and data noise are 
shown in Extended Data Figs 1-3). The results generated by SPPs with 
h_s = 112 μm, 224 μm, 448 μm and 840 μm, which correspond to L = 1,

Figure 3 | Addition of angular momenta along the direction of propagation
accomplished by two spiral phase plates. On the left are two interferograms
obtained separately with different SPPs: with effective OAM L_1 = 1
(h_s = 112 μm) and with L_2 = 2 (h_s = 224 μm). The horizontal and vertical
positions on the 2D neutron detector are shown in millimetres. On the right is
the interferogram obtained by concatenating both SPPs in path I of the neutron
interferometer (in the configuration indicated schematically), to make a
compound SPP with L_s = L_1 + L_2 = 3.

Figure 4 | Rotational invariance. Interferograms obtained using the compound
SPP with L = 3, as described in Fig. 3, for different positions of the phase
flag (see Fig. 1a). The horizontal and vertical positions on the 2D neutron detector
are shown (in millimetres). The three images were taken at different settings
of the phase flag, corresponding to the values ϕ_0 = −0.625°, 0° and 0.3215°, as
indicated. The image rotates by an angle proportional to the phase flag setting. The
average OAM is independent of the phase flag position, that is, L is preserved.


2, 4 and 7.5, are shown in Fig. 2. If the effective thickness L is close to an
integer value, it follows from equations (4) and (5) in Methods that the
interferogram is dominated by the characteristic pattern corresponding
to the OAM l ≈ L. Thus for integer values of L we see the expected sharp
angular distribution of neutron intensity, dominated by the contribution
from l ≈ L, while for non-integer L we have a distributed superposition
of contributions from numerous values of l. We find that the
action of multiple SPPs follows the familiar rule for addition of angular
momenta. If we concatenate two SPPs with step heights corresponding
to different OAMs L_1 and L_2, as shown in Fig. 3, the net transfer of OAM
to the neutron beam is L_s = L_1 + L_2. Figure 3 thus provides an experimental
demonstration of the elementary proposition 1 + 2 = 3.
Moreover, the composite angular momentum L_s transforms as a true
angular momentum under rotations of the composite phase plate structure
about its symmetry axis, as well as with respect to interference with
the beam in the other arm of the neutron interferometer (Fig. 4).
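
The addition rule can be illustrated numerically. In the sketch below, which is an idealised model and not the authors' analysis, each SPP is represented only by an azimuthal phase factor exp(iLφ) (sign conventions aside), two plates in series multiply their factors, and the topological charge is read off as the winding number of the phase around the axis. The function names are hypothetical.

```python
import cmath
import math
from typing import Callable


def spp_phase_factor(L: float, phi: float) -> complex:
    """Idealised azimuthal phase factor exp(i*L*phi) imprinted by one SPP."""
    return cmath.exp(1j * L * phi)


def winding_number(factor: Callable[[float], complex], samples: int = 720) -> int:
    """Number of 2*pi phase circulations accumulated on one turn around the axis."""
    total = 0.0
    prev = factor(0.0)
    for k in range(1, samples + 1):
        cur = factor(2 * math.pi * k / samples)
        total += cmath.phase(cur / prev)  # small phase increment between samples
        prev = cur
    return round(total / (2 * math.pi))


def stacked(phi: float) -> complex:
    """Two plates in series, L_1 = 1 and L_2 = 2: the phase factors multiply."""
    return spp_phase_factor(1, phi) * spp_phase_factor(2, phi)


if __name__ == "__main__":
    print("plate 1:", winding_number(lambda p: spp_phase_factor(1, p)))
    print("plate 2:", winding_number(lambda p: spp_phase_factor(2, p)))
    print("stacked:", winding_number(stacked))
```

Because a spatially uniform phase multiplies the factor by a constant, it leaves every phase increment unchanged, which is one way to see why the winding number is insensitive to the phase flag setting.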

In conclusion, we have demonstrated control of OAM of neutrons 
using easily manufactured macroscopic SPPs. The average OAM of the 
beams has been measured using a perfect-crystal neutron interferometer. 
The interferometric experiments exemplify the celebrated particle-wave 
duality of neutrons: on the one hand, neutrons are detected as individual 
particles, while on the other, neutrons traverse space like waves, carrying 
quantized values of OAM. Our interferograms indicate that each of these 
individual states has its OAM changed by the same amount when passed 
through an SPP of integer order. We showed that the action of two phase
plates placed in series increased the OAM by the sum of the OAM 
produced by each phase plate separately. This result is consistent with 
the accepted notion of additivity of OAM, and is also consistent with the 
conservation of vortex topological charge. Possible future directions of 
such studies include optimizing SPPs and experimenting with alternative 
phase-shifting devices, such as the fork-dislocation gratings also used in 
OAM control of electrons and photons; increasing the coherence of the 
input neutron beam; entanglement of neutron spin and orbital angular 
momenta; and angular-momentum-resolved measurements of neutron 
scattering. OAM states may also extend phase-contrast neutron
imaging to two dimensions. Even in its nascent form, as demonstrated
here, neutron OAM control in a single-crystal interferometer provides an
entangling operation between path and OAM. This potentially offers a
new tool for neutron-based quantum information science and tests of the
foundations of quantum mechanics.

Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references unique
to these sections appear only in the online paper.

METHODS

Experimental methods. Our experiments were performed at the NIST Center for 
Neutron Research in Gaithersburg, USA. Neutrons are produced by a 20-MW split-core
reactor moderated with heavy water, and are further cooled to around T = 20 K
by a liquid hydrogen cold source. The NIOF is about 25 m from the reactor
core, and neutrons are guided to it by a multilayer neutron guide. At NIOF, neutrons
are extracted from the guide by a pyrolytic graphite monochromator and collimated
by a set of cadmium slits. The neutron wavelength λ = 0.271 nm (energy
E ≈ 11 meV) used in the experiment was chosen to satisfy the Bragg condition
(θ_B = 25.6°) for the (111) crystallographic planes of a silicon crystal interferometer.
Figure 1a shows the neutron interferometer used in this experiment, which has the
Mach-Zehnder configuration that is in standard use in this field.
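
The quoted energy and Bragg angle follow from the wavelength alone, as the sketch below shows. It is a consistency check, not part of the original Methods: the physical constants are CODATA values, the silicon lattice constant (0.5431 nm) is an assumption not stated in the text, and the function names are hypothetical.

```python
import math

PLANCK_J_S = 6.62607015e-34          # Planck constant h
NEUTRON_MASS_KG = 1.67492749804e-27  # neutron rest mass
ELECTRON_VOLT_J = 1.602176634e-19    # joules per electronvolt
SI_LATTICE_CONSTANT_M = 0.5431e-9    # assumed silicon lattice constant (not from the text)


def neutron_energy_millielectronvolt(wavelength_m: float) -> float:
    """Kinetic energy E = h^2 / (2 m lambda^2) of a neutron, in meV."""
    energy_j = PLANCK_J_S**2 / (2 * NEUTRON_MASS_KG * wavelength_m**2)
    return energy_j / ELECTRON_VOLT_J * 1e3


def bragg_angle_deg(wavelength_m: float, h: int = 1, k: int = 1, l: int = 1) -> float:
    """First-order Bragg angle from lambda = 2 d sin(theta) for a cubic (hkl) plane."""
    d_spacing = SI_LATTICE_CONSTANT_M / math.sqrt(h * h + k * k + l * l)
    return math.degrees(math.asin(wavelength_m / (2 * d_spacing)))


if __name__ == "__main__":
    wavelength = 0.271e-9
    print(f"E = {neutron_energy_millielectronvolt(wavelength):.2f} meV")
    print(f"theta_B = {bragg_angle_deg(wavelength):.2f} deg")
```

With λ = 0.271 nm this gives an energy close to 11 meV and a (111) Bragg angle close to 25.6°, in line with the values quoted above.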

The integrating ³He detector was used for aligning the system, monitoring of the
reactor flux, and determining the experimental parameters such as the initial
interferometer phase (ϕ_0) and interferometer contrast (C). This phase and contrast
can be measured by rotating the phase flag (see angle of rotation in Fig. 1a) inside 
the interferometer with no sample present. During experiments with samples we 
use the integrating detector to monitor reactor neutron intensity. 

The 2D imaging neutron detector has a spatial resolution of 100 μm and
detection efficiency of 18% at the neutron wavelength of λ = 0.271 nm. The count
rate over the whole area of the 2D detector was approximately 1.9 neutrons per
second and a typical measurement time for each image collection was 3.5 days.
These long collection times necessitate robust phase stability of the neutron interferometer
platform. The interferometer phase drift was 1° per day and the image
noise per pixel was statistically limited. Images shown in Figs 2-4 were obtained
with the 2D detector, filtered with a standard ‘average’ filter over 10 × 10 pixels, and
normalized to a maximal count. Typical raw images from the detector, without
filtering and normalization, can be seen in Extended Data Figs 1-3. The integrating
counter in our experiment is a ³He detector with nearly 100% efficiency.
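
The image-processing steps described here (averaging filter, normalization to the maximal count, and shot-noise estimate) can be sketched as follows. This is a minimal pure-Python illustration, not the authors' code: the treatment of image edges (windows clipped to the image) is an assumption, since the text does not specify it, and all function names are hypothetical.

```python
import math
from typing import List

Image = List[List[float]]


def box_filter(image: Image, size: int = 10) -> Image:
    """Mean over a size x size window around each pixel; windows are clipped at edges."""
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        r0, r1 = max(0, r - half), min(rows, r - half + size)
        for c in range(cols):
            c0, c1 = max(0, c - half), min(cols, c - half + size)
            window = [image[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            out[r][c] = sum(window) / len(window)
    return out


def normalise_to_max(image: Image) -> Image:
    """Divide every pixel by the largest pixel value (assumed to be non-zero)."""
    peak = max(max(row) for row in image)
    return [[value / peak for value in row] for row in image]


def poisson_noise_to_signal(counts: float) -> float:
    """Shot-noise-limited noise-to-signal ratio sqrt(N)/N for N counts in a pixel."""
    return 1.0 / math.sqrt(counts)


def expected_counts(rate_per_s: float, days: float) -> float:
    """Total counts collected at a constant rate over the given number of days."""
    return rate_per_s * days * 86400.0


if __name__ == "__main__":
    print(f"total counts: {expected_counts(1.9, 3.5):.0f}")
    print(f"noise/signal at 45 counts: {poisson_noise_to_signal(45):.3f}")
```

At the quoted rate and collection time the detector registers roughly 5.7 × 10⁵ neutrons per image, and a pixel holding about 45 counts has a shot-noise-to-signal ratio of about 15% before filtering.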

The SPPs were machined out of dowels of Al 6061 alloy by a 5-axis computer 
numerical control milling machine at the NIST machine shop, following two dif- 
ferent procedures. The smaller plates (10 mm diameter) were cut by rotating the Al 
sample and moving the end mill out, while the larger plates (15 mm in diameter) 
were milled in the form of a helical staircase with approximately 200 treads. 

The experimental setup (that is, neutron interferometer with samples and phase 
flag, and the neutron detectors) is located within three nested enclosures 
(Matryoshka-style). To minimize phase drifts during the week-long data collection 
times, the temperature of the innermost enclosure is actively controlled to remain at 
24 °C within 5 mK (ref. 26). The middle enclosure is a Faraday cage with temperature
isolation and sound damping. It sits on a 40,000-kg, vibration-isolated table
suspended on air springs from a platform decoupled from the floor of the reactor
facility. The vibration isolation actively suppresses the mechanical noise spectrum
above 0.5 Hz and is controlled with micrometre precision. Changing the SPPs
requires opening of all three nested enclosures, and we found it necessary to wait
about 24 h afterwards for the system to return to equilibrium.

Theoretical analysis. OAM states of neutrons can be described with the paraxial
approximation of wave mechanics. Within this approximation, the wavefunction of
a freely propagating neutron can be written as Φ(r) = e^{i k_z z} Ψ(r), where r = (x, y, z) is
the position vector, and the envelope function Ψ(r) satisfies the 2D Helmholtz
equation:

$$\left( \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \right) \Psi + 2 i k_z \frac{\partial \Psi}{\partial z} = 0 \tag{3}$$

where k_z ≈ √(2mE)/ħ is the wavevector of a neutron with mass m and energy E
propagating forward along the z axis. OAM states comprise a complete set of
solutions of this equation in the cylindrical coordinates (ρ, φ, z), where
x = ρ cos(φ) and y = ρ sin(φ). The eigenfunctions take the form
Ψ(ρ, φ, z) = u(ρ, z) exp(ilφ), where the function u(ρ, z) describes the transverse
radial structure of the wavefront as a function of the propagation coordinate z.
The function exp(ilφ) is an eigenfunction of the OAM operator, l_z = −iħ ∂/∂φ,
with eigenvalue lħ, where l = 0, ±1, ±2, ... is an integer. In the simplest case when
l = 0, the envelope function reduces to a diffracting beam having a Gaussian profile
in the transverse direction at any axial position z.

For an arbitrary value of L the phase factor e^{iLφ} generated by an SPP describes a
superposition of OAM states:

$$e^{iL\varphi} = \sum_{l = 0, \pm 1, \ldots} \beta_l \, e^{il\varphi} \tag{4}$$

where the amplitudes β_l are given by the overlaps of the phase factor and the OAM
eigenfunctions:

$$\beta_l = e^{i\pi(L - l)} \, \mathrm{sinc}(L - l) \tag{5}$$

where sinc(x) = sin(πx)/(πx). For an OAM beam with zero angular momentum
passing through an SPP of effective thickness L, the probability distribution of the
resulting OAM states is provided by w_l = |β_l|². We notice that for a non-integer
effective thickness of the phase plate, such as L = 7.5, as used in our experiment,
a superposition of different OAM states would be produced with dominant
contributions from l = 7 and l = 8.
In general, given the uncertainty of L, even for the values of L = n + δL close to the
integers n = 0, ±1, ..., the resulting state is a mixture of OAM states.
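
The OAM weights implied by equations (4) and (5) are easy to tabulate. The sketch below assumes the normalised sinc defined above and a zero-OAM input beam, so that each weight is sinc²(L − l); the function names are our own.

```python
import math
from typing import Dict


def sinc(x: float) -> float:
    """Normalised sinc, sin(pi*x)/(pi*x), with the limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def oam_weights(L: float, l_min: int, l_max: int) -> Dict[int, float]:
    """Weights w_l = |beta_l|^2 = sinc(L - l)^2 for l_min <= l <= l_max."""
    return {l: sinc(L - l) ** 2 for l in range(l_min, l_max + 1)}


if __name__ == "__main__":
    weights = oam_weights(7.5, 0, 15)
    for l, w in sorted(weights.items(), key=lambda item: -item[1])[:4]:
        print(f"l = {l}: w = {w:.4f}")
```

For L = 7.5 the two largest weights are those of l = 7 and l = 8, each equal to (2/π)² ≈ 0.405, so together they carry about 81% of the probability; for integer L the whole weight sits on l = L.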

The distribution of the OAM states in any beam experiment depends upon 
the coherence properties of the incident beam. This issue has recently been 
discussed in the context of electron vortex experiments by Shiloh et al. In
our experiment, the coherence length in the direction orthogonal to the
propagation axis is l_c ≈ 60 nm (the corresponding wavevector component
k = 1/l_c ≈ 17 μm⁻¹). Since the beam radius R = 7.5 mm is much larger than
the coherence length, the beam is spatially incoherent, and the neutron beam as 
a whole does not have a definite state of angular momentum. 

Future applications and experiments. 

Studies of quantum information science and foundational questions of quantum 
mechanics. OAM states of neutrons could be used to extend the size of the Hilbert 
space available in neutron studies of fundamental quantum phenomena. Neutrons 
have been useful testbeds for quantum information with coherent degrees of 
freedom demonstrated for path, spin and momentum. Many of the most
useful demonstrations and applications of quantum information are richer in
higher dimensions, including quantum error correction and contextuality.
Neutron OAM for materials characterization. Recently demonstrated neutron
imaging and tomography paved a way to many applications, such as studies
of magnetic materials, magnetic domains in bulk materials and also superconductors.
OAM states of neutrons may benefit those studies. Indeed, the OAM
state may provide a unique probe for samples having embedded chirality such as
chiral nanoparticles and chiral liquid crystals. The neutron microscope that is
currently being developed at the NIST Center for Neutron Research has the potential
to use neutron OAM beams (similar to ref. 48) to study magnetic materials and
chiral structures. A combination of OAM and spin states of neutrons may also be 
employed to study skyrmions, which are currently investigated using small-angle 
scattering techniques or magnetic reflectometry. 

Extended Data Figure 1 | Raw data. Typical images of raw neutron count
data obtained over about 80 contiguous hours of data collection with the 2D
imaging detector (see Fig. 1). False-colour representation of neutron counts
per pixel as indicated by scale on image. On the left is an image of an SPP with
L = 0 that is equivalent to a uniform phase plate; on the right is an image of the
L = 3 compound SPP discussed in Fig. 3. The horizontal and vertical positions
on the 2D neutron detector are shown in millimetres.

Extended Data Figure 2 | Data processing. Illustrations of steps taken to
convert raw images collected on the 2D detector to the images shown in Figs 2-4.
a, Raw data. b, Same data, passed through 2D averaging filter with averaging
taken over a 10 pixel × 10 pixel square. c, Filtered data normalized to maximum
value of intensity in b. The horizontal and vertical positions on the 2D neutron
detector are shown in millimetres.
Panel titles: L = 3, raw; L = 3, filtered; L = 3, normalized.

Extended Data Figure 3 | Noise distribution. Illustrations of steps taken to
model effects of shot noise in the raw images collected on the 2D detector
and images shown in Figs 2-4. a, Image of the raw data. b, Poisson noise or
square root of each pixel shown in a. c, Noise-to-signal ratio of image in a. The
horizontal and vertical positions on the 2D neutron detector are shown in
millimetres.
Panel titles: L = 3, raw; L = 3, Poisson noise; L = 3, noise/signal.

A concise synthesis of (+)-batzelladine B from
simple pyrrole-based starting materials


Brendan T. Parr, Christos Economou & Seth B. Herzon


Alkaloids, secondary metabolites that contain basic nitrogen 
atoms, are some of the most well-known biologically active natural 
products in chemistry and medicine. Although efficient laboratory
synthesis of alkaloids would enable the study and optimization
of their biological properties, their preparation is often
complicated by the basicity and nucleophilicity of nitrogen, its
susceptibility to oxidation, and its ability to alter reaction outcomes
in unexpected ways, for example through stereochemical
instability and neighbouring group participation. Efforts to
address these issues have led to the invention of a large number
of protecting groups that temper the reactivity of nitrogen; however,
the use of protecting groups typically introduces additional
steps and obstacles into the synthetic route. Alternatively, the use
of aromatic nitrogen heterocycles as synthetic precursors can
attenuate the reactivity of nitrogen and streamline synthetic strategies.
Here we use such an approach to achieve a synthesis of the
complex anti-HIV alkaloid (+)-batzelladine B in nine steps (long- 
est linear sequence) from simple pyrrole-based starting materials. 
The route uses several key transformations that would be challen- 
ging or impossible to implement using saturated nitrogen hetero- 
cycles and highlights some of the advantages of beginning with 
aromatic reagents. 

The retrosynthetic conversion of a saturated nitrogen heterocycle to 
a heteroaromatic exchanges a reactive, basic functional group with one 
that is lower in energy, non-basic, non-nucleophilic, and more easily 
manipulated. For example, analysis of well-appreciated physical 
organic scales of basicity and nucleophilicity shows that the six-membered
heterocycle piperidine is much more basic (pK_b = 3.1,
DMSO, ref. 5) and nucleophilic (nucleophilicity parameter N = 18.1,
H₂O, ref. 6) than is the corresponding aromatic heterocycle pyridine
(pK_b = 10.6, DMSO, ref. 7; N = 11.0, H₂O, ref. 6). Moreover, functionalized
heteroaromatics are readily elaborated by well-established
carbon-carbon bond-forming reactions, such as cross-couplings. 
In the strategy we pursue here, simple pyrrole-based precursors serve 
as sources of partially or fully saturated nitrogen heterocycles and 
are advanced by carbon-carbon bond-forming and reductive transformations.
This approach complements terpene synthesis and
biosynthesis, which typically proceeds by oxidation of a complex
hydrocarbon template.

We applied this strategy towards a synthesis of the guanidinium
alkaloid (+)-batzelladine B (1, Fig. 1a). Structurally, 1 contains a syn-tricyclic
guanidine (vessel) connected to a bicyclic guanidine (anchor)
via an alkyl ester. At least 15 batzelladine alkaloids have been isolated
and several members of this family inhibit the binding of
HIV glycoprotein gp120 to human CD4 receptor cells (half-maximal
inhibitory concentration of 1 is 31 μM), thereby preventing viral
induction. The absolute stereochemistries of the vessel and anchor
of 1 were established in refs 13 and 14, respectively, and syntheses
and synthetic studies of other batzelladines have been reported
(for selected examples, see refs 15-22). Notably, in ref. 13, a tethered
Biginelli condensation strategy was developed and this has provided
access to several batzelladines and related alkaloids. In refs 16 and 17,
enantioselective synthetic routes to (+)-batzelladine A (2) are
reported, but a route to 1 has not been described.

We envisioned that the vessel and anchor fragments of 1 (Fig. 1b) 
could be derived from the pyrrole-based precursors 3 and 6, respect- 
ively, if suitable methods for carbon-carbon bond formation 
and controlled reduction in oxidation state could be achieved. A rhodium-catalysed
formal [4 + 3] cycloaddition between 3 and a donor-acceptor
carbene was proposed to provide entry to the dehydrotropane
4, which contains all of the functional group handles required for
synthesis of the vessel fragment. We posited that the pyrrole 6 could 
serve as a precursor to the anchor of 1 by a Mannich addition to form 7 
(ref. 25), followed by cyclization and controlled adjustment of oxida- 
tion state, with concomitant isomerization. 

The N-amidinylpyrrole 3 (Fig. 2a) was prepared in two steps and 
75% yield from commercial reagents (see Supplementary Informa- 
tion). Extensive experimentation was required to realize the formal 
[4 + 3] cycloaddition with high yield and stereoselectivity. Ultimately, 
we found that use of the (S)-pantolactonyl α-diazo ester 9 (ref. 26) and
dirhodium(II) tetrakis[N-phthaloyl-(S)-tert-leucinate] (Rh₂[(S)-pttl]₄)
as catalyst (0.5 mol%) provided the dehydrotropane 10 in 93% yield
and >95:5 diastereoselectivity. Formal cycloaddition between 3 and

Figure 1 | Structure and synthetic analysis of (+)-batzelladine B (1). a, The
chemical structures of (+)-batzelladine B (1) and (+)-batzelladine A
(2), with embedded pyrrole substructures shown. b, The strategy we used
produces the vessel and anchor substructures of 1 (5 and 8, respectively)
from the pyrrole-based starting materials 3 and 6, via the intermediates
4 and 7.

Figure 2 | Synthesis of the vessel fragment of (+)-batzelladine B (1) and
determination of its stereochemistry. a, Synthesis of the vessel precursor
17. Reagents and conditions: (1) Rh₂[(S)-pttl]₄ (0.5 mol%), pentane, 36 °C,
93%, >95:5 d.r., or Rh₂[(S)-pttl]₄ (0.1 mol%), pentane, 36 °C, 87%, >95:5 d.r.;
(2) H₂ (30 atm), ClRh(PPh₃)₃ (2.0 mol%), i-PrOH, 23 °C; (3) tetra-n-butylammonium
fluoride (TBAF), TMS-EBX, THF-CH₂Cl₂ (8:1), −78 °C,


ent-9 using the same catalyst provided 10 with 76:24 diastereoselec- 
tivity (81% yield), demonstrating that the former substrate—catalyst 
pair is stereochemically matched (an example of double asymmetric 
synthesis”’). Formal cycloaddition between 3 and achiral diazoesters 
afforded the corresponding adducts in 45%-93% yield and 60%-86% 
enantiomeric excess (Supplementary Table 1). The yield of 10 was 
essentially unaffected (87%) when the catalyst loading was reduced 
to 0.1 mol%. With this key step accomplished, the pyrroline ring was 
selectively reduced by treatment with chlorotris(triphenylphosphine)rhodium 
under dihydrogen (10 → 11). Exposure of the reduction 
product 11 to tetra-n-butylammonium fluoride and 
1-[(trimethylsilyl)ethynyl]-1,2-benziodoxol-3(1H)-one (TMS-EBX)”* at −78 °C formed 
the α-alkynyl-β-ketoester 12 as a single diastereomer (¹H NMR analysis). 
The first three steps of this sequence were readily telescoped to 
provide 12 in 80% overall yield after one purification. The relative 
stereochemistry of 12 was unequivocally established by 7-endo-dig 
hydroguanylation”’ (12 → 18, Fig. 2b) followed by carbamate cleavage 
(18 → 19) and X-ray analysis. 
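
The matched and mismatched selectivities quoted above can be put on a common energy scale. The following is a minimal sketch, assuming (the text does not state this) that both ratios are set under kinetic control at the 36 °C given in the Fig. 2 conditions, so that the apparent free-energy gap is RT ln(major/minor). The function names are illustrative only.

```python
import math

R_GAS = 8.314462618  # gas constant in J mol^-1 K^-1

def diastereomeric_excess(major, minor):
    """Diastereomeric excess (%) for a major:minor ratio."""
    return 100.0 * (major - minor) / (major + minor)

def apparent_ddg_kj(major, minor, temp_c):
    """Apparent free-energy gap (kJ/mol) implied by a product ratio.

    Assumption (not from the paper): the ratio is set under kinetic
    control, so ddG = R * T * ln(major / minor).
    """
    temp_k = temp_c + 273.15
    return R_GAS * temp_k * math.log(major / minor) / 1000.0

# Ratios quoted in the text; 95:5 is a lower bound (reported as >95:5).
matched = apparent_ddg_kj(95, 5, 36.0)
mismatched = apparent_ddg_kj(76, 24, 36.0)
print(f"matched (9):        d.e. >= {diastereomeric_excess(95, 5):.0f}%, "
      f"ddG >= {matched:.1f} kJ/mol")
print(f"mismatched (ent-9): d.e. =  {diastereomeric_excess(76, 24):.0f}%, "
      f"ddG =  {mismatched:.1f} kJ/mol")
```

Under these assumptions the matched pair corresponds to a gap of at least about 7.6 kJ/mol, against roughly 3 kJ/mol for the mismatched pair.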

We then investigated the ring-opening of the bicyclic skeleton of 12 
by cleavage of the β-ketoester. Because 12 presents four acidic sites, a 
careful balance between the protonation state of the substrate and the 
basicity of the incoming nucleophile was essential to achieving the 
desired mode of reactivity. After intensive experimentation and optim- 
ization, we found that deprotonation of 12 with n-butyllithium 
(1.0 equiv.) followed by the addition of lithium benzyl octanoate 
(1.8 equiv.) afforded the bicyclic pyrrolidine 15. This cascade sequence 
is thought to proceed by 1,2-addition to the β-ketoester, retro-aldol 
ring-opening, and proton transfer to provide the enolyne 13. 
Isomerization of 13 to the acylallene 14 followed by Michael addition 
of the guanidinyl anion and neutralization of the resulting enolate may 
then provide 15. The addition of 1,3-dimethyl-3,4,5,6-tetrahydro-2- 

80% (from 3 + 9); (4) n-BuLi, then lithium benzyl octanoate, THF, then 
DMPU, −78 °C; (5) H2 (1 atm), Pd/C (10 mol%), THF, 23 °C, 49% (two steps); 
and (6) LiOH, THF–H2O (2:1), 0 °C, 75%. b, The relative stereochemistry of 
12 was established by cyclization and deprotection, followed by X-ray analysis. 
Reagents and conditions: (7) AgOAc, AcOH, CH2Cl2, 24 °C, >99% and (8) 
TFA, CH2Cl2, 0–23 °C, >99%. 


pyrimidinone (DMPU) was necessary to promote the retro-aldol ring- 
opening. Other nucleophiles, such as sodium methoxide, morpholine, 
ethanethiol, and organometallic reagents (for example, Grignard, 
organocerium, organotitanium, or organozinc reagents), were also 
investigated, but in most instances complex mixtures of products were 
obtained. The addition-rearrangement product 15 exists as a mixture 
of diastereomers and tautomers; consequently, the β-ketoester of the 
unpurified product was cleaved with palladium on carbon under dihy- 
drogen, to form the ketone 16 (49% from 12). Saponification of 
the pantolactonyl ester (lithium hydroxide) afforded the keto-acid 
17 (75%; 29% overall from 3). 

The anchor fragment was assembled by the sequence shown in 
Fig. 3, beginning with a highly diastereoselective Mannich addition” 
to form the β-aminoester of the target. Treatment of 6 with lithium 
diisopropylamide (LDA) and chlorotris(isopropoxy)titanium, followed 
by addition of the sulfinimine 20, provided the product 21 in 
99% yield. The addition product 21 was formed as a single detectable 
C2 stereoisomer and an inconsequential (approximately 94:6) mixture 
of C1 stereoisomers (¹H NMR analysis). The C1 and C2 stereocentres 
were assigned by analogy to related products” and the C2 stereochem- 
istry was confirmed by derivatization (see Supplementary Infor- 
mation). Notably, attempts to functionalize saturated analogues of 
6 by a Mannich addition would be complicated by issues of diastereoselectivity 
and β-elimination. Owing to the presence of the alkyne 
and the difficulties associated with handling the vinylogous carbamate 
of the target’’, reduction of the pyrrole ring was postponed until later 
in the sequence. The tert-butanesulfinyl substituent of 21 was cleaved 
by treatment with hydrochloric acid in methanol, and the resulting 
product was cyclized in the presence of bis(chlorodibutyltin)oxide to 
provide the urea 22 (78%, two steps). O-Selective ethylation formed an 
iso-urea (90%, not shown) that was treated with 2,4-(dimethoxy)benzyl 

Figure 3 | Synthesis of the anchor fragment of (+)-batzelladine B 
(1). Reagents and conditions: (1) LDA, Ti(Oi-Pr)3Cl, THF, −78 °C, 99%, 
>20:1 mixture of C1 stereoisomers, approximately 94:6 mixture of C2 
stereoisomers; (2) HCl, CH3OH–1,4-dioxane (4.4:1), 0 °C; (3) (ClSnBu2)2O, 
toluene, 100 °C, 78% (two steps); (4) EtOTf, 2,4,6-tri-tert-butyl-pyrimidine, 


(DMB) amine hydrogen chloride to afford the guanidine 23 (71%). The 
ester was then cleaved (trimethylsilyl trifluoromethanesulfonate, 2,6- 
lutidine) and the resultant carboxylic acid was coupled with the alcohol 
24 to provide 25, which contains the complete carbon framework of the 
anchor (75%). Anti-Markovnikov reductive hydration” of the terminal 
alkyne of 25 mediated by the ruthenium catalyst 26 (15 mol%) formed 
the alcohol 27 (71%; 26% overall from 6). The addition of p-toluene- 
sulfonic acid (PTSA) to quantitatively protonate the guanidine was 
essential in this step; in its absence, the conversion of 25 was low. 
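
The overall yields quoted for the two fragments follow from multiplying the stage yields reported in the text. A short check, in which the grouping of steps into stages is our reading of the text and the function name is illustrative:

```python
import math

def overall_yield(stage_yields_percent):
    """Overall yield (%) of a linear sequence from its per-stage yields (%)."""
    return 100.0 * math.prod(y / 100.0 for y in stage_yields_percent)

# Stage yields as quoted in the text; a multi-step yield counts as one stage.
vessel_stages = [80, 49, 75]              # 3 -> 12, 12 -> 16, 16 -> 17
anchor_stages = [99, 78, 90, 71, 75, 71]  # 6 -> 21 -> 22 -> iso-urea -> 23 -> 25 -> 27

print(f"vessel 17: {overall_yield(vessel_stages):.0f}% from 3")
print(f"anchor 27: {overall_yield(anchor_stages):.0f}% from 6")
```

The products round to 29% and 26%, in agreement with the overall yields stated for 17 and 27.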

The vessel and anchor fragments 17 and 27 were coupled 
using 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide hydrochloride 
(EDC·HCl) to provide the penultimate intermediate 28 (77%), and the 
synthesis was completed by the carefully optimized sequence shown in 
Fig. 4. A dry mixture of palladium on carbon and the coupling product 
28 was suspended in trifluoroacetic acid under argon for 2 h at 24 °C. 
Under these conditions, the four tert-butoxycarbonyl protecting groups 
were cleaved, the liberated vessel domain underwent cyclodehydration, 



CH2Cl2, 23 °C, 90%; (5) DMBNH3Cl, 3 Å molecular sieves, EtOH, 70 °C, 71%; 
(6) TMSOTf, 2,6-lutidine, CH2Cl2, 0–23 °C, then 24, 4-(dimethylamino)pyridine 
(DMAP), 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide 
(EDC)·HCl, CH2Cl2, 0–23 °C, 75%; and (7) 26 (15 mol%), PTSA (1.0 equiv.), 
HCO2H, N-methyl-2-pyrrolidinone (NMP)–H2O (4:1), 23 °C, 71%. 


and the 1,1-disubstituted enamide was isomerized into conjugation 
with the ester (28 → 29). Upon completion of this step (as judged by 
ultra-high-performance liquid chromatography/mass spectrometry 
analysis), the atmosphere within the reaction vessel was replaced with 
dihydrogen. Stirring the resultant mixture for 18 h at 24 °C effected 
stereoselective reduction of the trisubstituted eneguanidine of the vessel 
(>20:1 diastereomeric ratio (d.r.); see Supplementary Information)”*, 
controlled semireduction of the anchor pyrrole with tandem isomer- 
ization of the resultant dihydropyrrole, and cleavage of the DMB sub- 
stituent, to provide 1 in 40% isolated yield (45% by NMR). 

Previous approaches to synthesizing batzelladine alkaloids and 
related natural products have used non-aromatic (aliphatic) nitrogen 
precursors, followed by stepwise adjustments (typically, increases) of 
oxidation state. The approach we have presented proceeds in the 
opposite direction and begins with oxidized nitrogen heteroaromatics, 
followed by carbon-carbon bond-forming reactions and controlled 
reduction to achieve the saturation patterns of the target. This 

Figure 4 | Coupling of 17 and 27 and completion of the synthesis of (+)-batzelladine B (1). Reagents and conditions: (1) EDC·HCl, DMAP, CH2Cl2, 24 °C, 
77% and (2) TFA, Pd/C, argon, 0 °C, then H2, 24 °C, 45% (NMR), 40% (isolated). 

approach demonstrates additional synthetic pathways that are not 
apparent or viable when starting from aliphatic nitrogen building 
blocks, and tempers nitrogen’s promiscuous and often problematic 
reactivity. An added virtue of this strategy is its dependence on late- 
stage carbon-hydrogen bond-forming reactions, which are among the 
most reliable classes of transformations. 

Lithospheric controls on magma composition along 
Earth’s longest continental hotspot track 


D. R. Davies, N. Rawlinson, G. Iaffaldano & I. H. Campbell 


Hotspots are anomalous regions of volcanism at Earth’s surface 
that show no obvious association with tectonic plate boundaries. 
Classic examples include the Hawaiian-Emperor chain and the 
Yellowstone-Snake River Plain province. The majority are 
believed to form as Earth’s tectonic plates move over long-lived 
mantle plumes: buoyant upwellings that bring hot material from 
Earth’s deep mantle to its surface’. It has long been recognized that 
lithospheric thickness limits the rise height of plumes”* and, 
thereby, their minimum melting pressure. It should, therefore, 
have a controlling influence on the geochemistry of plume-related 
magmas, although unambiguous evidence of this has, so far, been 
lacking. Here we integrate observational constraints from surface 
geology, geochronology, plate-motion reconstructions, geochem- 
istry and seismology to ascertain plume melting depths beneath 
Earth’s longest continental hotspot track, a 2,000-kilometre-long 
track in eastern Australia that displays a record of volcanic activity 
between 33 and 9 million years ago**, which we call the Cosgrove 
track. Our analyses highlight a strong correlation between litho- 
spheric thickness and magma composition along this track, with: 
(1) standard basaltic compositions in regions where lithospheric 
thickness is less than 110 kilometres; (2) volcanic gaps in regions 
where lithospheric thickness exceeds 150 kilometres; and (3) low- 
volume, leucitite-bearing volcanism in regions of intermediate 
lithospheric thickness. Trace-element concentrations from sam- 
ples along this track support the notion that these compositional 
variations result from different degrees of partial melting, which is 
controlled by the thickness of overlying lithosphere. Our results 
place the first observational constraints on the sub-continental 
melting depth of mantle plumes and provide direct evidence that 
lithospheric thickness has a dominant influence on the volume and 
chemical composition of plume-derived magmas. 

Plate tectonic theory successfully describes how the lithosphere— 
Earth’s rigid outermost shell—consists of a mosaic of segments that 
move and interact across the surface of our planet. It also accounts for 
the majority of Earth’s volcanism, which is concentrated at plate 
boundaries. However, an important class of volcanism occurs within 
plates or across plate boundaries, often forming linear volcanic chains 
that grow older in the direction of plate motion. Because of their 
geometry, age distributions, topographic expression and geochemical 
characteristics, most of these so-called hotspots are believed to mark 
the surface expression of upwelling mantle plumes’”. 

Around 50 hotspots have been identified at Earth’s surface*’. Of 
these, only ~20% occur on continents and, hence, most of our under- 
standing of mantle plumes comes from hotspot tracks in oceanic set- 
tings. However, oceanic lithosphere is regularly recycled into the 
mantle through subduction, so if we are to understand plume-related 
volcanism before ~200 million years ago (Ma), which constitutes most 
of Earth’s geological record’, we must learn: (1) how plumes interact 
with continental lithosphere; and (2) how this interaction affects the 
chemical composition and erupted volume of lavas at the surface. 
Here, we first combine observational constraints from surface geology, 


geochronology and plate-motion histories to identify Earth’s longest 
continental hotspot track in eastern Australia. We subsequently integ- 
rate constraints from seismology and geochemistry to ascertain how 
regional lithospheric thickness variations influence the volume and 
composition of plume-derived magmas along this track. 

Cenozoic era volcanism in eastern Australia represents one of the 
world’s most extensive intraplate volcanic regions’ (Fig. 1). Three 
types of volcano are identified in the widely used classification of 
Wellman and McDougall", which we also adopt here: (1) central 
volcanoes, which are predominantly basaltic in composition but have 
felsic lava flows or intrusions, with lavas typically produced from 
central vents, often building large volcanic complexes; (2) lava fields, 
which are basaltic, extensive and thin and are often characterized by an 
abundance of small scoria, lava cones and maars; and (3) the leucitite 
suite, which is dominated by low-volume, potassium-rich, leucitite- 
bearing lavas. These volcanic classes are principally distinguished 
petrologically, with central volcanoes distinguished from lava-field 
volcanoes by the presence of felsic rock and both distinguished 
from the leucitite suite by the absence of leucitite’*. However, these 
classes also show considerable differences in age trends: ⁴⁰Ar–³⁹Ar and 
K-Ar geochronological studies demonstrate that both the central vol- 
canoes and the leucitite suite define age-progressive tracks that 
become younger to the south. The tracks identified, so far, include: 
(1) Comboyne—an ~770-km-long track that extends from Fraser 
Island in Queensland to Comboyne in New South Wales, displaying 
a record of volcanic activity from ~32 Ma to 16 Ma (ref. 13); (2) 
Canobolas—an ~760-km-long track, extending from Bunya in 
Queensland to Canobolas in New South Wales, recording volcanism 
from ~24 Ma to 12 Ma (ref. 14); (3) an ~400-km-long track, extend- 
ing across central Queensland from Cape Hillsborough to Buckland 
and recording volcanism from ~34 Ma to 27 Ma (ref. 6); and (4) an 
~650-km-long leucitite-bearing track, extending from Bokhara River 
in New South Wales to Cosgrove in Victoria and displaying a record of 
volcanic activity from ~17 Ma to 9 Ma (ref. 5). These tracks are 
widely believed to mark the passage of mantle plumes beneath the 
northwards-migrating Australian plate**'*’*. Lava-field volcanics, 
on the other hand, show no such age progression and are thought to 
be generated through an alternative process, with an edge-driven 
model being suggested for the formation of the Newer Volcanics 
Province (NVP)'>”°. 

The central volcanoes of central Queensland and the leucitite 
suite of New South Wales and Victoria have been considered 
unrelated**'?'*””, principally because the volcanic provinces identified 
in each differ dramatically in composition and eruptive volume, and 
they are separated by a volcanic gap of >650 km. However, their 
relative locations and ages suggest that they may be the surface 
expression of the same mantle plume, thus constituting a single hotspot 
track. In Fig. 1a we test this hypothesis by predicting volcanic 
locations along this track, using the reconstructed absolute motion of 
the Australian plate’’. Specifically, we take the mapped locations of 15 
volcanic centres (see Extended Data Table 1 and Extended Data Fig. 1), 

Figure 1 | The distribution and classification of eastern Australian 
Cenozoic volcanic centres and their relationship to regional lithospheric 
thickness variations. a, Volcanic centres, coloured following the classification 
of Wellman and McDougall'’, where black, grey and red denote central 
volcanoes, basaltic lava fields and leucitite-bearing volcanics, respectively. 
Age-progressive hotspot tracks are denoted by black dashed lines. We identify Earth’s 
longest continental hotspot track, the Cosgrove track, which extends across 
the Australian continent from Cape Hillsborough (CH) to Cosgrove (C). Stars 
denote a selection of volcanic centres along this track that are coloured by age (in 
Ma), while squares mark predicted volcanic locations, based on reconstructed 
absolute motions of the Australian plate’. If the predicted locations fall within 


dated by ⁴⁰Ar–³⁹Ar techniques**, and predict their location at the time 
associated with the next dated volcanic centre (see Methods). Stars in 
Fig. 1a denote the locations of these dated volcanoes, while squares 
denote their predicted locations, based upon the reconstruction 
(note that for clarity, only 9 of the 15 dated volcanoes used in our 
analyses are shown: see Extended Data Fig. 2 for a comparable figure 
with all 15 volcanoes). Circles represent our estimate for the uncer- 
tainty in predicted locations, which arises through a combination of: 
(1) uncertainties in the underlying mantle plume diameter and the 
associated melt region’s lateral extent*’; (2) the potential for plume 
drift’°*"; (3) the unpredictability introduced through preferential 
melt extraction pathways”; and (4) uncertainties in the ⁴⁰Ar–³⁹Ar ages 
of dated volcanic centres (see Methods and Extended Data Fig. 3). If a 
predicted location falls within these circles, then we consider that 
volcanic centre to be the surface expression of the same mantle plume 
(filled squares in Fig. 1a): 8 of the 9 volcanic centres shown in Fig. 1a, 
and 14 of the 15 volcanic centres in our full analyses, satisfy this 
criterion. The only volcanic centre that fails to satisfy this criterion is 
Begargo Hill of the leucitite suite, which is consistently located further 

the circles surrounding each dated volcanic centre (our measure of the 
uncertainty in predicted locations), then we consider that volcanic centre to 
be the surface expression of the same mantle plume: eight of nine volcanic centres 
satisfy this criterion (filled squares), while one does not (open square). The dotted 
line, to the south of Cosgrove, marks the predicted extension of the Cosgrove 
hotspot track, towards its present-day location, which is illustrated in Extended 
Data Fig. 2. B, Buckland; BH, Begargo Hill; BR, Bokhara River; BU, Bunya; 
CA, Canobolas; CO, Comboyne; FI, Fraser Island; NVP, Newer Volcanics 
Province. b, The same volcanic centres, plotted above an estimate of lithospheric 
thickness, highlighting a clear correlation between lithospheric thickness and 
volcanic outcrop and composition along the Cosgrove hotspot track. 


south than is predicted by our reconstruction. We speculate that this 
points towards either a rapid phase of southwards plume motion 
(exceeding 4 cm yr⁻¹) from ~17 Ma to 15 Ma, a change in the melt 
extraction pathway, or a combination of both. In support of these 
ideas, we note that: (1) variable plume migration rates have been 
observed elsewhere” and are also predicted in global mantle convec- 
tion simulations*”'; and (2) Begargo Hill lies to the south of a region of 
increased lithospheric thickness (discussed later), which may focus 
sub-lithospheric plume material and any associated melt southwards, 
as a result of Australia’s rapid motion towards the north. Most notably, 
however, the northernmost dated leucitite-bearing volcano satisfies 
the aforementioned location criterion, confirming that the central 
volcanoes of central Queensland and the leucitite suite of New South 
Wales and Victoria are the surface expression of the same mantle 
plume. Together, they constitute Earth’s longest continental hotspot 
track, which we call the Cosgrove track. This knowledge, however, 
leads to further questions. Specifically, since these volcanic centres 
are the surface expression of the same mantle plume, why does the 
Cosgrove track display large volcanic gaps? What drives the consid- 

Figure 2 | Trace-element abundances of volcanic samples along the 
Cosgrove hotspot track. a, Trace-element concentrations (in p.p.m.), 
normalized to primitive mantle values*’, from published analyses of four 
leucitite-bearing and four mantle-derived basaltic samples along the southern 


erable variations in volume and chemical composition of plume- 
derived magmas between the central volcanoes and the leucitite suite? 
Are both of these characteristics related? To answer these fundamental 
questions, we combine observational constraints from seismology and 
geochemistry. 

We first generate a map of lithospheric thickness, by combining 
constraints from recent three-dimensional body-wave tomography 
results”? with the regional Australian Seismological Reference 
Model (AuSREM)” (Fig. 1b) (for further details, including uncertain- 
ties, see Methods and Extended Data Figs 4-6). The main features 
evident in Fig. 1b include: (1) the contrast between thicker lithosphere 
in the centre of Australia and thinner lithosphere to the east, which is 
consistent with a transition from Precambrian central to Phanerozoic 
eastern Australia and then oceanic lithosphere outboard of the contin- 
ental margin; (2) a zone of thin lithosphere, bound to the east by a zone of 
intermediate-thickness lithosphere of similar width, which extends 
southwestwards from ~30°S through central New South Wales into 
northern Victoria’’; and (3) considerable changes in lithospheric thick- 
ness over relatively short horizontal distances. It is generally accepted 
that these lithospheric ‘steps’ will produce complex flows” and, as noted 
previously, this led to the suggestion of an edge-related model for the 
formation of the NVP’*’%”*. Indeed, such edge-related mechanisms are 
probably also applicable to other lava-field volcanism in the region: as 
illustrated in Fig. 1b, all lava-field volcanic provinces lie adjacent to 
substantial steps in lithospheric thickness, above comparatively thin 
lithosphere, thus providing a favourable setting’*"®. 

What remains poorly understood, however, is how mantle plumes 
interact with these lithospheric thickness variations and, specifically, 
how they influence the volume and composition of plume-derived 
magmas. Intriguing trends are evident along the Cosgrove track in 
Fig. 1b: (1) volcanic gaps occur in regions where lithospheric thickness 
exceeds ~150 km; (2) the basaltic and felsic central volcanoes of cent- 
ral Queensland occur in regions where lithospheric thickness is less 
than ~110 km; and (3) low-volume, leucitite-bearing volcanism to the 
south occurs, exclusively, in regions of intermediate lithospheric thick- 
ness, with volcanic gaps within the leucitite suite also coinciding with 
regions of thicker lithosphere. These unambiguous trends suggest that 
the thickness of overlying lithosphere is dictating the volume and 
composition of plume-derived magmas, by limiting the rise height 
of the underlying plume and, hence, the degree of partial melting. 
We infer that the underlying mantle plume: (1) cannot rise to shallow 
enough depths to induce decompression melting in regions where 
lithospheric thickness exceeds ~150 km, thus providing an explana- 
tion for the volcanic gaps along the Cosgrove track and placing the first 
observational constraint on the maximum melting depth of mantle 

and northern sections of the Cosgrove track, respectively (see Methods). b, A 
plot of the Gd/Lu ratio (a proxy for melting depth) against the barium (Ba) 
concentration (a tracer for the extent of melting) for these samples. 


plumes beneath continents (excluding ultra-volatile melts that form 
kimberlites and carbonatites); (2) undergoes high-degree partial melt- 
ing beneath comparatively thin lithosphere to produce basaltic and 
felsic central volcanoes along the northern segment of the Cosgrove 
track; and (3) undergoes very low-degree partial melting in regions of 
intermediate lithospheric thickness, thus facilitating the production 
of low-volume leucitite-bearing volcanics towards the southern end 
of the Cosgrove track. 
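
The three regimes inferred above amount to a simple lookup on lithospheric thickness. A minimal sketch, in which the function name is hypothetical and the approximate limits of ~110 km and ~150 km quoted in the text are treated as sharp cut-offs purely for illustration:

```python
def expected_volcanism(thickness_km, thin_limit_km=110.0, thick_limit_km=150.0):
    """Map lithospheric thickness to the volcanic style reported along the track.

    The two limits are the approximate values quoted in the text; treating
    them as sharp cut-offs is a simplification made for this sketch.
    """
    if thickness_km < 0:
        raise ValueError("thickness must be non-negative")
    if thickness_km < thin_limit_km:
        return "basaltic and felsic central volcanoes"
    if thickness_km > thick_limit_km:
        return "volcanic gap"
    return "low-volume leucitite-bearing volcanism"

for h in (90.0, 130.0, 170.0):  # hypothetical thickness values, in km
    print(h, "->", expected_volcanism(h))
```

In practice the transitions are likely to be gradational, and the thickness estimate itself carries uncertainty (see Methods).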

To ascertain whether or not these inferences are compatible with 
geochemical observations, we collated previously published trace- 
element data from volcanic outcrops along the Cosgrove track’’?* 
(see Methods and Extended Data Tables 2, 3). Although this data set 
is limited, consisting of only eight data points (four from the central 
volcanoes of central Queensland and four from the leucitite suite), it 
provides a basis for testing our hypothesis. As mantle material under- 
goes partial melting, incompatible trace elements, such as barium (Ba), 
are preferentially transferred into the melt. Indeed, for the most 
incompatible elements, their concentration in the melt is, to a good 
approximation, inversely proportional to the melt fraction. As the 
degree of partial melting increases, their concentrations are subse- 
quently diluted’’. Accordingly, if the available trace-element data sup- 
ported the inferences presented earlier, leucitite-bearing volcanics 
would display higher concentrations of incompatible trace elements 
when compared to the basaltic central volcano samples. Such trends 
are indeed apparent in the trace-element concentrations plotted in 
Fig. 2: leucitite samples display barium concentrations that exceed 
those of basaltic samples by a factor of ~3. Our inference that the melt 
fraction along the Cosgrove track is controlled by lithospheric thick- 
ness, which limits the rise height of plumes, should also leave a dis- 
cernible signature in the trace-element concentrations illustrated in 
Fig. 2. Melting at greater depth, in the presence of higher proportions 
of garnet”®, will sequester the heavy rare-earth elements, such as lute- 
tium (Lu), with respect to the middle rare-earth elements, such as 
gadolinium (Gd). Consequently, samples from the leucitite suite 
should display elevated Gd/Lu ratios when compared to the basaltic 
samples from central Queensland. As illustrated in Fig. 2, this is the 
case: leucitites typically possess Gd/Lu ratios of ~6, compared to ~2-3 
for the basalts of central Queensland. Our hypothesis, therefore, is 
supported by the available trace-element data. 
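
The dilution argument above can be made concrete with the standard equilibrium (batch) melting relation, C_melt/C_source = 1/(D + F(1 − D)), where F is the melt fraction and D the bulk partition coefficient. The sketch below assumes this textbook model, which is not specified in the text, and uses hypothetical melt fractions chosen only to reproduce a factor-of-3 contrast.

```python
def batch_melt_enrichment(melt_fraction, bulk_d=0.0):
    """C_melt / C_source for equilibrium (batch) melting.

    Uses C_melt / C_source = 1 / (D + F * (1 - D)), a textbook model that is
    an assumption of this sketch; the paper does not specify a melting model.
    """
    if not 0.0 < melt_fraction <= 1.0:
        raise ValueError("melt_fraction must be in (0, 1]")
    return 1.0 / (bulk_d + melt_fraction * (1.0 - bulk_d))

# Hypothetical melt fractions, chosen only to illustrate a factor-of-3 contrast.
low_degree, high_degree = 0.02, 0.06
ratio = batch_melt_enrichment(low_degree) / batch_melt_enrichment(high_degree)
print(f"enrichment ratio (low-degree / high-degree melt): {ratio:.1f}")
```

For a perfectly incompatible element (D = 0) the enrichment is exactly 1/F, so a threefold difference in barium concentration corresponds to a threefold difference in melt fraction, provided the source composition is the same.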

An aspect of the Cosgrove track that has not been addressed is why 
basaltic and felsic central volcanoes do not re-emerge to the south of 
Cosgrove, in a region of comparatively thin lithosphere. As mantle 
plumes have finite lifetimes*”’, it is possible that the underlying plume 
faded at ~8 Ma, although this would be an unlikely coincidence. We 
speculate that an alternative mechanism is at play: it has previously 
been demonstrated’* that regional three-dimensional lithospheric 
thickness variations, coupled with rapid northwards motion of the 
Australian plate, give rise to a focused edge-driven convection 
(EDC) cell to the west of the Cosgrove track, in the vicinity of the 
NVP, thus providing a mechanism for the localization of lava-field 
volcanism to this region. However, an explanation for the onset of 
NVP volcanism, at ~5 Ma (ref. 11), has remained elusive: the litho- 
spheric thickness variations driving EDC in this region are probably 
long-lived”’, which is difficult to reconcile with comparatively recent 
volcanism. Our reconstructions, however, place the mantle plume that 
generated the Cosgrove hotspot track <50 km to the east of the NVP, 
from ~6.5 Ma to 5 Ma. We speculate that the capture and entrainment 
of this plume, into a pre-existing EDC cell, was the trigger for mag- 
matism within the NVP and explains the absence of a hotspot track to 
the south of Cosgrove. In support of these ideas, we note that although 
EDC is expected to occur on all lithospheric steps, including those to 
the east of the plume’s predicted passage, the dominant cell in this 
region lies directly beneath the NVP’*. Accordingly, preferential west- 
wards flow (and, hence, entrainment of plume material) into the NVP 
region, is to be expected, and was indeed evident in our previous 
models'*. To our knowledge, the interaction between mantle plumes 
and EDC has not been documented elsewhere. However, this process: 
(1) has important implications for the surface expression of mantle 
plumes in the vicinity of step changes in lithospheric thickness; and 
(2) provides a solution to the global puzzle of why step changes in 
lithospheric thickness, which occur along craton edges and at passive 
margins, produce volcanism only at isolated locations, thus comple- 
menting our previous study’». Finally, we note that the predicted pre- 
sent-day location of the mantle plume that generated the Cosgrove 
hotspot track lies to the northwest of Tasmania, which coincides with a 
region of recent seismicity and is at the western limit of the so-called 
East Australia Plume System imaged previously using finite frequency 
tomography”? (see Extended Data Fig. 2). 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS 


⁴⁰Ar–³⁹Ar-dated volcanic centres. Of the volcanic centres that make up the 
Cosgrove hotspot track, 15 have recently been dated via ⁴⁰Ar–³⁹Ar geochronology. 
These include nine volcanic centres from the central volcanoes of Central 
Queensland® and six volcanic centres from the leucitite suite of New South 
Wales and Victoria®. These data are summarized in Extended Data Table 1, with 
localities also illustrated in Extended Data Fig. 1. 

Reconstructed hotspot track. In Fig. 1a, we test the hypothesis that the central 
volcanoes of central Queensland and the leucitite suite of New South Wales and 
Victoria represent the surface expression of the same mantle plume, thus consti- 
tuting a single hotspot track, by predicting volcanic locations along this track, 
using the reconstructed absolute motion of the Australian plate'*. Specifically, 
we take the mapped location of 15 volcanic centres*® (Extended Data Table 1) 
and predict their location at the time associated with the next (in a temporal sense) 
dated volcanic centre. Stars in Fig. 1a denote the locations of these dated volcanoes, 
while squares denote their predicted locations, based upon the reconstruction. 
Note that for illustrative purposes, only 9 of the 15 dated volcanoes used in our 
analyses were shown in Fig. 1a. A comparable figure, for all 15 dated volcanoes, is 
shown in Extended Data Fig. 2. 
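
The prediction step described above reduces to rotating a surface point about an Euler pole through an angle equal to the angular velocity multiplied by the stage duration. The following is a minimal sketch of that operation on a spherical Earth, using Rodrigues' rotation formula. The pole position, rotation rate and starting coordinates in the usage line are hypothetical placeholders, not values from the plate reconstruction used in this study, and the sign convention for the rotation would need to match that reconstruction.

```python
import math

def to_xyz(lat_deg, lon_deg):
    """Unit vector for a geographic position given in degrees."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def to_latlon(v):
    """Latitude and longitude (degrees) of a unit vector."""
    x, y, z = v
    lat = math.degrees(math.asin(max(-1.0, min(1.0, z))))
    return lat, math.degrees(math.atan2(y, x))

def rotate_about_pole(lat_deg, lon_deg, pole_lat_deg, pole_lon_deg, angle_deg):
    """Rotate a point about an Euler pole (Rodrigues' rotation formula)."""
    p = to_xyz(lat_deg, lon_deg)
    k = to_xyz(pole_lat_deg, pole_lon_deg)
    a = math.radians(angle_deg)
    k_cross_p = (k[1] * p[2] - k[2] * p[1],
                 k[2] * p[0] - k[0] * p[2],
                 k[0] * p[1] - k[1] * p[0])
    k_dot_p = sum(ki * pi for ki, pi in zip(k, p))
    rotated = tuple(p[i] * math.cos(a)
                    + k_cross_p[i] * math.sin(a)
                    + k[i] * k_dot_p * (1.0 - math.cos(a))
                    for i in range(3))
    return to_latlon(rotated)

def stage_rotation(lat_deg, lon_deg, pole_lat_deg, pole_lon_deg,
                   rate_deg_per_myr, dt_myr):
    """Apply a constant angular velocity about a fixed pole for dt_myr."""
    return rotate_about_pole(lat_deg, lon_deg, pole_lat_deg, pole_lon_deg,
                             rate_deg_per_myr * dt_myr)

# Hypothetical inputs for illustration only (not the reconstruction's values).
print(stage_rotation(-25.0, 148.0, 15.0, 40.0, 0.6, 2.0))
```

A full reconstruction would chain one such rotation per temporal stage, each with its own pole and rate.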

In doing this, we have implicitly assumed that the underlying mantle plume is a 
fixed point source at the base of the lithosphere and that melt transport towards 
Earth’s surface is not influenced by pre-existing lithospheric structure. However, 
palaeomagnetic data indicate that mantle plumes drift slowly, at a fraction of 
surface plate velocities***”’, an inference that is supported by global mantle con- 
vection simulations**'. In addition, plume tails have an upper-mantle diameter 
that probably exceeds 200 km and produce a sub-lithospheric melt region that may 
be up to half of this size*!’, while lithospheric structure has an important role in 
controlling the lateral transport of plume material and melt-extraction pathways”. 
We therefore estimate the uncertainty in our predicted locations (Fig. 1a, circles, 
and Extended Data Fig. 2), by allowing for plume motion at a modest rate of up to 
1 cm yr⁻¹, and adding 100 km to account for the melt region’s lateral extent and
the unpredictability introduced through preferential melt extraction pathways. 
Furthermore, we propagate the uncertainty on the measured ages of volcanic 
samples onto the predicted volcanic locations**. Specifically, we use the measured 
age uncertainties”® to shorten/lengthen the duration of the temporal stages over 
which the instantaneous angular velocities/poles’* are applied. This results in two 
different predictions of geographic positions. We take the geodesic distance 
between these two as a representative value of the uncertainty contribution, on 
the predicted locations, which is associated with age. If a predicted location falls 
within the resulting uncertainty circles, we consider that volcano to be the surface 
expression of the same mantle plume (Fig. 1a, filled squares, and Extended Data 
Fig. 2): 8 of the 9 volcanic centres shown in Fig. 1a and 14 of the 15 shown in 
Extended Data Fig. 2 satisfy this criterion. The sensitivity of our results to the 
uncertainty parameters specified (that is, the rate of plume motion and the melt 
zone’s lateral extent) is illustrated in Extended Data Fig. 3. 
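
The acceptance criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. It assumes that the three uncertainty contributions (plume drift, melt-region extent and the age-related term) are simply summed into one radius, and it uses a spherical-Earth haversine distance with a mean radius of 6,371 km. The function names and the example values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; spherical approximation (assumption)


def geodesic_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def uncertainty_radius_km(stage_myr, age_term_km, drift_cm_per_yr=1.0, melt_km=100.0):
    """Plume drift over the stage plus melt-region allowance plus age term.

    The additive combination is an assumption made for this sketch.
    """
    drift_km = drift_cm_per_yr * 10.0 * stage_myr  # 1 cm/yr equals 10 km/Myr
    return drift_km + melt_km + age_term_km


def same_plume(predicted, dated, radius_km):
    """True if the predicted (lat, lon) lies within radius_km of the dated (lat, lon)."""
    return geodesic_km(*predicted, *dated) <= radius_km


# Hypothetical example: a 1 Myr stage and a 15 km age-related term.
radius = uncertainty_radius_km(stage_myr=1.0, age_term_km=15.0)
print(radius)  # 10 km drift + 100 km melt region + 15 km age term = 125.0
```

Varying `drift_cm_per_yr` and `melt_km` in such a sketch corresponds to the kind of sensitivity test summarized in Extended Data Fig. 3.
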
Lithospheric thickness estimate. The lithospheric thickness estimate given in this 
study is derived from recent three-dimensional seismic tomography results'*”’, 
which exploit body-wave arrival time information from the WOMBAT trans- 
portable array in eastern Australia. WOMBAT is the largest experiment of its type 
in the southern hemisphere, which, to date, comprises a cumulative total of ~700 
stations, with spacings varying from 50 km on the mainland to 15 km in Tasmania. 
Extended Data Figure 4 shows the location of the 12 sub-arrays used in this study. 
Note that although two additional arrays have been deployed to the northeast, the 
associated data are not yet available for use. 

FMTOMO™* is used to invert relative arrival time residuals of various global 
P-wave phases for three-dimensional velocity variations. Owing to the dominance 
of short-period seismometers (1 Hz natural frequency) in WOMBAT, S-wave 
arrivals are difficult to identify, which is why we favour P-wave arrivals. Two 
limitations of using relative P-wave arrival times in seismic tomography are that 
absolute velocities are not constrained, and the long-wavelength structure (greater 
than the aperture of the sub-arrays) is filtered out during the inversion. We 
overcome these limitations by using the AuSREM mantle model⁴¹, which has a
horizontal resolution of ~200–250 km, as the starting model in the inversion.
Assuming that AuSREM is accurate at this structural wavelength, the problem 
of suturing together relative arrival time data from different arrays is overcome*® 
and both the absolute and relative velocities can be considered (although the latter 
will be more reliable). 

The use of three-dimensional mantle velocity models to derive estimates of 
lithospheric thickness is common. In Australia, a number of different estimates 
have been made based upon S-wave velocity models obtained from surface wave 
tomography*’’. Proxies for the depth to the base of the lithosphere include a 
particular choice of velocity contour, a decrease in velocity within a certain
depth range, and the behaviour of the velocity gradient. In our case, we have a
three-dimensional P-wave velocity model, which is likely to be less sensitive to
the presence of a mechanically weak sub-lithospheric mantle layer than a
three-dimensional S-wave velocity model. Moreover, the limited depth resolution caused
by the sub-vertical nature of the incident paths means that using the depth to the 
base of fast velocity anomalies, or some chosen contour, is unlikely to yield robust 
results, particularly in terms of absolute values. Extended Data Figure 5 shows a
slice through the P-wave velocity model at 120 km depth. It reveals a clear
division between higher velocities to the west and lower velocities to the east,
consistent with a transition from Precambrian central and western Australia, to
Phanerozoic eastern Australia and finally oceanic lithosphere outboard of the
continental margin. Regions of elevated velocities probably correspond to thicker
lithosphere, as the pattern of velocity variations shares a first-order resemblance to 
the lithospheric thickness estimates of previous studies”. 

The AuSREM model⁴¹ also includes a lithospheric thickness estimate, based on
velocity gradients from surface and body-wave tomography and refracted waves in 
the mantle. This estimate shows a broad consistency with other studies*?°, with 
the main difference being in the absolute depths to the base of the lithosphere 
rather than the pattern of depth variations. Minimum depths vary between
~50 km and 70 km in the vicinity of the eastern seaboard, whereas maximum
depths vary between 180 km and 220 km in the central western section of our
model region. On the basis of changes in the vertical gradient of SV-wave velocity, 
one study provides upper and lower bounds on the transition from lithosphere to 
sub-lithospheric mantle. This transition exceeds 50 km at several locations. Both 
the models described in refs 40 and 41 correspond more closely to the lower bound 
estimates of ref. 39. 

Owing to the potential for vertical smearing in the three-dimensional 
WOMBAT model, we instead take the vertically averaged velocity over a pre- 
scribed depth range and calibrate this against minimum and maximum regional 
lithospheric thickness estimates from previous studies. Thus, the maximum aver- 
age velocity will equate to the thickest lithosphere, while the minimum average 
velocity will equate to the thinnest lithosphere, with a linear relationship assumed 
between depth and average velocity. This approach essentially assumes that
variations in the thickness of a higher-velocity lithosphere above a lower-velocity
sub-lithospheric mantle are responsible for the observed variations in arrival time.
Although this is a relatively crude assumption, if we apply this scheme to the 
P-wave component of the AuSREM model over a depth range of 50–200 km, then
the resultant estimate of lithospheric thickness is similar to the AuSREM litho- 
sphere thickness estimate. Extended Data Figure 6a shows the result of using the 
same 50–200 km depth range with the three-dimensional WOMBAT tomography
model (equivalent to Fig. 1b), noting that north of ~28° S, we revert to the AuSREM
model, as there is no additional data coverage in this region. The pattern of depth 
variations is remarkably similar to the pattern of velocity variations at 120 km 
depth in Extended Data Fig. 5, demonstrating that velocities do not vary signifi- 
cantly with depth in the uppermost mantle. Compared to previous lithospheric 
thickness estimates for the Australian continent, the main difference is that the 
new model includes a zone of (on average) intermediate thickness lithosphere that 
extends southwards from about 30° S through central New South Wales before 
terminating in northern Victoria. This feature is crucial to the results of this study, 
as it hosts the leucitite-bearing volcanics. 
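
The calibration described above amounts to a linear map from depth-averaged velocity to lithospheric thickness. The following Python sketch illustrates that idea under stated assumptions: the vertical average is taken as an unweighted mean of the samples in the depth window, and all names and numerical values in the example are hypothetical rather than taken from the WOMBAT model.

```python
def vertical_average(velocities, depths_km, z_top_km=50.0, z_bottom_km=200.0):
    """Unweighted mean of the velocity samples whose depth lies in [z_top_km, z_bottom_km]."""
    picked = [v for v, z in zip(velocities, depths_km) if z_top_km <= z <= z_bottom_km]
    if not picked:
        raise ValueError("no samples in the requested depth range")
    return sum(picked) / len(picked)


def thickness_from_velocity(v_avg, v_min, v_max, t_min_km, t_max_km):
    """Linearly map an average velocity onto lithospheric thickness (km).

    v_min maps to the thinnest lithosphere (t_min_km) and v_max to the
    thickest (t_max_km), as in the calibration described in the text.
    """
    if v_max <= v_min:
        raise ValueError("v_max must exceed v_min")
    frac = (v_avg - v_min) / (v_max - v_min)
    return t_min_km + frac * (t_max_km - t_min_km)


# Hypothetical example: average velocity perturbations spanning -0.25 to 0.25 km/s,
# calibrated against thickness bounds of 50 km and 220 km.
print(thickness_from_velocity(0.0, -0.25, 0.25, 50.0, 220.0))  # midpoint: 135.0
```
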

To investigate the robustness of our lithospheric thickness estimate, we perform 
the same calculation but vary the input parameters. In particular, we vary the 
minimum thickness of the model between 50 km and 70 km, and the maximum 
thickness between 180 km and 220 km, which reflects the range of extremes in our 
geographic region, between the models of refs 39, 40 and 41. Moreover, in an 
attempt to account for limited vertical resolution in the mantle and the potential 
for unresolved crustal structure to smear into mantle structure, we vary the min- 
imum depth over which we take the velocity average between 50 km and 100 km 
and the maximum depth from 180 km to 230 km. In the former case, the
three-dimensional tomographic model explicitly includes the AuSREM crustal and
Moho depth model" to minimize this trade-off effect, so this range is probably 
quite generous. However, with a minimum depth set at 100 km, we are likely to 
ignore shallower anomalies that are unrelated to lithospheric thickness. The range 
of maximum depths was chosen to account for smearing effects at depth; for 
example, if high velocity lithosphere terminating at 200 km depth is recovered 
as slightly lower velocity lithosphere terminating at 230 km depth owing to
smearing, then it may manifest as thicker lithosphere if the vertical averaging is taken to a
depth of 230 km rather than 200 km. We generate a total of 540 models by iterating
over these four depth ranges, using an increment of 10 km. Extended Data
Figure 6b shows the standard deviation of this model ensemble. In general, 
the uncertainty is dominated by our chosen range of minimum and maximum
lithosphere thicknesses, but there is also a contribution from our choice of depth
range over which to vertically average the velocity. The end result indicates that, in
most regions of the lithosphere thickness model, the standard deviation is
~10 km or less, implying that the first-order features of Extended
Data Fig. 6a (and Fig. 1b) are probably correct. Moreover, the pattern of thickness 
variations is more robust than the absolute variations, which are dependent on the 
minimum and maximum thickness bounds that we obtain from previously pub- 
lished models. 
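
The ensemble size quoted above follows directly from the four parameter ranges and the 10 km increment. The short Python sketch below enumerates the combinations; it is an illustration of the counting only, and the variable names are hypothetical.

```python
from itertools import product

STEP_KM = 10  # increment stated in the text


def inclusive_range(start_km, stop_km, step_km=STEP_KM):
    """Values from start_km to stop_km inclusive, in steps of step_km."""
    return list(range(start_km, stop_km + step_km, step_km))


min_thickness = inclusive_range(50, 70)      # minimum lithospheric thickness: 3 values
max_thickness = inclusive_range(180, 220)    # maximum lithospheric thickness: 5 values
average_top = inclusive_range(50, 100)       # top of the averaging window: 6 values
average_bottom = inclusive_range(180, 230)   # base of the averaging window: 6 values

ensemble = list(product(min_thickness, max_thickness, average_top, average_bottom))
print(len(ensemble))  # 3 * 5 * 6 * 6 = 540
```

Each tuple in `ensemble` would parameterize one thickness model; the standard deviation across those models at each map location gives an uncertainty map of the kind shown in Extended Data Fig. 6b.
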
Geochemical samples. Leucitites of New South Wales and Victoria: trace-element 
data from published analyses of leucitite samples along the Cosgrove track are 
summarized in Extended Data Tables 2 and 3. These include three samples from 
Cosgrove in Victoria²⁶ (acquired by solution ICPMS) and one sample from
Condobolin in New South Wales¹⁷ (acquired by XRF). Limited trace-element data
for three samples at Begargo Hill, Flagstaff and El Capitan in New South Wales, 
from ref. 42 (acquired by isotope dilution), are also included. Given that trace- 
element concentrations are only provided for four elements (Rb, Sr, Nd and Sm) in 
ref. 42, they are not plotted in Fig. 2. Nonetheless, the limited trace-element 
concentrations provided are consistent with more complete data sets¹⁷,²⁶.

Basaltic samples from the central volcanoes of central Queensland: we restrict 
our choice of trace-element data to samples with a composition that is likely to 
reflect their mantle source (that is, SiO₂ < 50%; MgO > 9%). The selected samples
also have no negative Nb/Ta or Eu anomalies, which indicates no detectable crustal
contamination or fractional crystallization within the crust. 
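
The major-element screen stated above can be expressed as a simple filter. The Python sketch below is illustrative only: the sample identifiers and oxide values are hypothetical, and the additional check for negative trace-element anomalies is not implemented here.

```python
def reflects_mantle_source(sio2_wt_pct, mgo_wt_pct):
    """Major-element screen from the text: SiO2 < 50 wt% and MgO > 9 wt%."""
    return sio2_wt_pct < 50.0 and mgo_wt_pct > 9.0


# Hypothetical samples; the oxide values are invented for illustration.
samples = [
    {"id": "hypothetical-A", "SiO2": 45.2, "MgO": 11.0},
    {"id": "hypothetical-B", "SiO2": 52.1, "MgO": 10.3},  # fails the SiO2 criterion
    {"id": "hypothetical-C", "SiO2": 47.8, "MgO": 6.5},   # fails the MgO criterion
]

selected = [s["id"] for s in samples if reflects_mantle_source(s["SiO2"], s["MgO"])]
print(selected)  # ['hypothetical-A']
```
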

Geochemical data, with the exception of ref. 17, can be downloaded from the 
GEOROC database at EarthChem (http://www.earthchem.org/). 


32. Antretter, M., Steinberger, B., Heider, F. & Soffel, H. Paleolatitudes of the Kerguelen
    hotspot: new paleomagnetic results and dynamic modelling. Earth Planet. Sci. Lett.
    203, 635–650 (2002).

33. Tarduno, J. A., Bunge, H.-P., Sleep, N. & Hansen, U. The bent Hawaiian-Emperor
    hotspot track: inheriting the mantle wind. Science 324, 50–53 (2009).

34. Rawlinson, N. & Urvoy, M. Simultaneous inversion of active and passive source
    datasets for 3-D seismic structure with application to Tasmania. Geophys. Res. Lett.
    33, L24313 (2006).

35. Rawlinson, N., Tkalčić, H. & Reading, A. M. Structure of the Tasmanian lithosphere
    from 3D seismic tomography. Aust. J. Earth Sci. 57, 381–394 (2010).

36. Rawlinson, N., Salmon, M. & Kennett, B. L. N. Transportable seismic array
    tomography in southeast Australia: illuminating the transition from Proterozoic to
    Phanerozoic lithosphere. Lithos 189, 65–76 (2014).

37. Simons, F. J. & van der Hilst, R. D. Age-dependent seismic thickness and
    mechanical strength of the Australian lithosphere. Geophys. Res. Lett. 29, 1529
    (2002).

38. Fishwick, S. & Rawlinson, N. 3-D structure of the Australian lithosphere from
    evolving seismic datasets. Aust. J. Earth Sci. 59, 809–826 (2012).

39. Yoshizawa, K. Radially anisotropic 3-D shear wave structure of the Australian
    lithosphere and asthenosphere from multi-mode surface waves. Phys. Earth Planet.
    Inter. 235, 33–48 (2014).

40. Fishwick, S., Heintz, M., Kennett, B. L. N., Reading, A. M. & Yoshizawa, K. Steps in
    lithospheric thickness within eastern Australia: evidence from surface wave
    tomography. Tectonics 27, TC4009 (2008).

41. Kennett, B. L. N. & Salmon, M. AuSREM: Australian seismological reference model.
    Aust. J. Earth Sci. 59, 1091–1103 (2012).

42. Nelson, D. R., McCulloch, M. T. & Sun, S.-S. The origins of ultrapotassic rocks as
    inferred from Sr, Nd, and Pb isotopes. Geochim. Cosmochim. Acta 50, 231–245
    (1986).


©2015 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Map: labelled volcanic centres Hillsborough, Nebo, Peak Range, Springsure, Buckland, Byrock, El Capitan, Begargo Hill, Flagstaff Hill, Griffith and Cosgrove; longitude ticks at 145° E, 150° E and 155° E.]

Extended Data Figure 2 | The Cosgrove hotspot track. As in Fig. 1a but
incorporating all 15 dated volcanic complexes and extended southwards to
show the predicted present-day location of the underlying mantle plume
(green square to the northwest of Tasmania). The approximate location of the
East Australia Plume System, imaged previously using finite frequency
tomography”, is marked by the dotted green line.

Extended Data Figure 3 | Reconstruction score map. The number of
predicted volcanic centre locations, from a total of 15 (listed in Extended
Data Table 1), that fall within the uncertainty circles surrounding the dated
volcanic centres, for a range of plume drift velocities and melt region diameters.

Extended Data Figure 4 | Location of WOMBAT array stations used to create the
three-dimensional P-wave velocity model from which our lithospheric thickness
estimate was derived. Station spacing is ~50 km, which roughly equates to the
maximum horizontal resolution of the three-dimensional velocity model.




[Map: latitude 20° S to 35° S, longitude 145° E to 155° E; colour scale δVp (km/s) from −0.25 to 0.25.]

Extended Data Figure 5 | Depth slice at 120 km, through the three-dimensional
P-wave velocity model. North of ~28° S, the model reverts to the AuSREM
mantle model, owing to a lack of additional data coverage in this region (see
Extended Data Fig. 4).

Extended Data Figure 6 | Lithospheric thickness estimate and associated
uncertainty. a, b, Lithosphere thickness model illustrated in Fig. 1b
(a), alongside an estimate of its uncertainty (b), given by the standard deviation
(σ) of an ensemble of 540 plausible models examined. Note that south of
~28° S, the lithospheric thickness estimate is constrained by high-resolution
body-wave tomography (~50 km horizontal resolution), whereas north
of this latitude it is constrained entirely by the AuSREM mantle model
(~200–250 km horizontal resolution)⁴¹.
Extended Data Table 1 | Age estimates, derived via ⁴⁰Ar–³⁹Ar geochronology, for the volcanic centres considered in this study

Location                                    Sample ID   Latitude (°)   Longitude (°)   Age (Ma)      Source
Mt Jukes, Hillsborough, QLD                 BC-98       -21.00         148.94          33.6 ± 0.5    Ref. 6
Mt Britton, Nebo, QLD                       DR13704     -21.47         148.59          32.6 ± 0.4    Ref. 6
Mt Pollux, N. Peak Range, QLD               BC-88       -22.48         147.87          30.7 ± 0.4    Ref. 6
Lords Table Mountain, N. Peak Range, QLD    BC-90       -22.66         148.02          30.4 ± 0.5    Ref. 6
Wolfgang Peak, N. Peak Range, QLD           BC-89       -22.55         147.83          30.3 ± 0.4    Ref. 6
Malvern Hill, S. Peak Range, QLD            BC-94       -22.88         148.22          29.0 ± 0.4    Ref. 6
Ropers Peak, S. Peak Range, QLD             BC-95r      -22.87         148.22          28.6 ± 0.4    Ref. 6
Mt St. Peter, Springsure, QLD               BC-71       -24.00         148.04          28.1 ± 0.3    Ref. 6
Hervey’s Knob, Buckland, QLD                BC-57       -24.84         147.92          27.3 ± 0.4    Ref. 6
El Capitan, NSW                             CPT II-I    -31.22         146.20          17.9 ± 0.3    Ref. 5
Byrock Quarry, NSW                          BQ-7        -30.71         146.31          17.1 ± 0.3    Ref. 5
Begargo Hill, NSW                           BEH-V       -33.53         146.36          15.5 ± 0.5    Ref. 5
Flagstaff Hill, NSW                         FGH-I       -33.80         146.09          15.3 ± 0.2    Ref. 5
Griffith Quarry, NSW                        GR-7        -34.23         145.92          14.9 ± 0.4    Ref. 5
Cosgrove Quarry, VIC                        E7980       -36.34         145.60          8.9 ± 0.2     Ref. 5

For illustrative purposes, only nine volcanic centres are shown in Fig. 1a (those coloured in black here). However, all 15 centres were used in our reconstruction of the Cosgrove hotspot track (including those shown
in grey here).
Extended Data Table 2 | Rock type, sample locations, sample numbers and data source, from previously analysed samples along the
Cosgrove track

Rock type            Sample location                  Sample ID   Source
Leucitite            Weeber Hill, Condobolin, NSW     CND4        Ref. 17
Leucitite            Cosgrove, VIC                    CG1         Ref. 26
Leucitite            Cosgrove, VIC                    CG2         Ref. 26
Leucitite            Cosgrove, VIC                    CG3         Ref. 26
Transitional basalt  Springsure, QLD                  Q135        Ref. 17
Ol-tholeiite         Springsure, QLD                  Q137        Ref. 17
Basanite             Mt. Llandillo, Anakie, QLD       AK16/6      Ref. 17
Basanite             Black Peak, Anakie, QLD          AK39/1      Ref. 17
Leucitite            Begargo Hill, NSW                GA-3471     Ref. 42
Leucitite            Flagstaff, NSW                   GA-3481     Ref. 42
Leucitite            El Capitan, NSW                  GA-3479     Ref. 42

Associated trace-element concentrations are provided in Extended Data Table 3. The limited trace-element concentrations provided for the three samples from ref. 42 (shaded grey) are not plotted in Fig. 2.
Extended Data Table 3 | Trace-element concentrations for a number of samples along the Cosgrove track

Element  PM       CND4     CG1      CG2      CG3      Q135     Q137     AK16/6   AK39/1   GA-3471  GA-3481  GA-3479
Cs       0.032    1.05     1.14     1.69     1.05     0.35     0.40     0.20     0.20     -        -        -
Rb       0.635    96.0     90.9     112.1    86.7     23.3     21.0     28.5     26.8     78.9     185      265
Ba       6.989    1150     966.0    1011     964      335      267      373.0    454.0    -        -        -
Th       0.095    8.1      8.87     9.26     8.63     4.3      2.9      4.3      7.2      -        -        -
U        0.021    1.3      1.72     1.9      2.09     0.5      0.7      0.7      1.8      -        -        -
Nb       0.713    96.0     107      118      116      41.0     27.0     54.0     60.0     -        -        -
Ta       0.041    6.2      5.44     5.78     5.83     1.8      1.3      2.8      2.5      -        -        -
La       0.687    75.0     79.5     84.6     85.2     26.5     22.5     37.5     50.0     -        -        -
Ce       1.775    158.0    162      168      169      57.0     48.0     76.0     89.0     -        -        -
Sr       21.10    1154     1131     1209     1089     569      432      777      923.0    1251     1265     972
Nd       1.354    74.0     75.6     80.6     80.6     28.0     24.0     35.5     35.5     86.6     92.7     118.3
Sm       0.444    12.8     13.1     14.3     14.3     5.3      4.0      6.6      7.3      14.1     15.1     16.7
Zr       11.20    536      433      470      463      233      180      219      192      -        -        -
Hf       0.309    12.0     9.19     9.8      9.82     4.2      3.6      4.0      3.3      -        -        -
Eu       0.168    3.75     3.88     4.20     4.18     2.05     1.75     2.45     2.3      -        -        -
Gd       0.596    9.60     11.5     12.0     12.2     5.4      4.8      6.6      6.0      -        -        -
Tb       0.108    1.20     1.37     1.46     1.44     0.80     0.61     1.0      1.0      -        -        -
Y        4.550    26.0     31.4     33.5     31.8     24.8     24.0     23.8     21.5     -        -        -
Ho       0.164    1.05     1.08     1.16     1.13     0.70     0.86     1.15     1.0      -        -        -
Yb       0.493    1.55     1.75     1.86     1.78     1.70     1.80     1.90     1.7      -        -        -
Lu       0.074    0.20     0.239    0.248    0.248    0.25     0.27     0.26     0.24     -        -        -

See Extended Data Table 2. Concentrations are in p.p.m. Although not plotted in Fig. 2, the limited trace-element concentrations (Rb, Sr, Nd and Sm) for leucitites provided in ref. 42 (that is, GA-3471, GA-3479 and
GA-3481) are consistent with the concentrations provided in the more complete data sets of refs 17 and 26. PM denotes primitive mantle concentrations from ref. 31.




doi:10.1038/nature14952 


Novel competitors shape species’ responses to climate change


Jake M. Alexander¹, Jeffrey M. Diez² & Jonathan M. Levine¹


Understanding how species respond to climate change is critical for 
forecasting the future dynamics and distribution of pests, diseases 
and biological diversity’ *. Although ecologists have long acknowl- 
edged species’ direct physiological and demographic responses to 
climate, more recent work suggests that these direct responses can 
be overwhelmed by indirect effects mediated via other interacting 
community members”’. Theory suggests that some of the most 
dramatic impacts of community change will probably arise 
through the assembly of novel species combinations after asyn- 
chronous migrations with climate* '°. Empirical tests of this pre- 
diction are rare, as existing work focuses on the effects of changing 
interactions between competitors that co-occur today”''’. To 
explore how species’ responses to climate warming depend on 
how their competitors migrate to track climate, we transplanted 
alpine plant species and intact plant communities along a climate 
gradient in the Swiss Alps. Here we show that when alpine plants 
were transplanted to warmer climates to simulate a migration fail- 
ure, their performance was strongly reduced by novel competitors 
that could migrate upwards from lower elevation; these effects 
generally exceeded the impact of warming on competition with 
current competitors. In contrast, when we grew the focal plants 
under their current climate to simulate climate tracking, a shift in 
the competitive environment to novel high-elevation competitors 
had little to no effect. This asymmetry in the importance of chan- 
ging competitor identity at the leading versus trailing range edges 
is best explained by the degree of functional similarity between 
current and novel competitors. We conclude that accounting for 
novel competitive interactions may be essential to predict species’ 
responses to climate change accurately. 

Climate change will alter species’ competitive environments through 
initial shifts in the performance and relative abundance of their current 
competitors, and longer-term changes in the identity of their compe- 
titors caused by migration and local extinctions’. Empirical studies of 
the shorter-term changes in neighbour abundance provide evidence 
both for”!’"*> and against’*"* the importance of competitive interac- 
tions in mediating the impact of climate change. However, these results 
may underestimate the potential role of changing competition. Over 
longer timescales, species will experience competition from new and 
functionally different migrants, and if they themselves migrate to 
track climate change, they will probably encounter new resident 
competitors”’. Despite the potential importance of these novel com- 
petitive interactions in determining species’ persistence and future 
distributions with climate change'®’”, empirical evidence is scant for 
two reasons. First, in most systems, the combinations of species that will
face one another in the future are highly uncertain. Second, the logistical
challenges associated with experimentally assembling hypothetical 
future communities, and doing so under realistic climate scenarios, 
are typically prohibitive. 

Elevation gradients in mountains provide a unique opportunity to 
test how changing competitor identity will affect species’ responses to 
climate change. The steep climate gradient in these environments
means that the novel competitors that species will face following 
climate warming are those already occurring only hundreds of metres 
away. Furthermore, perennial grasslands in these regions lend them- 
selves to whole-community transplantation along climate gradients. 
We experimentally simulated the endpoints of the spectrum of com- 
petitive environments that an alpine species will experience following 
climate change at the leading and trailing edges of its range (Fig. 1). At 
its trailing range edge, a species that fails to migrate will experience 
warmer climate and compete with either its current community mem- 
bers (scenario 1 in Fig. 1), or with a novel community composed of 
species that have migrated upwards from lower elevation (scenario 2). 
By contrast, at the leading edge of its range, a species migrating to 
higher elevations to track its current climate will compete either with 
its current competitors if they also migrate (scenario 3) or with a novel 
higher-elevation community that has persisted in place (scenario 4). 
To simulate these scenarios, we transplanted focal alpine species and 
intact plant communities along an elevation gradient in the Swiss Alps 
(Table 1), and followed their performance for 2 years. To simulate 
scenarios in which focal species and/or communities fail to migrate 
and thus experience warmer temperatures, we moved focal plants 
and/or communities to a lower-elevation site. To simulate scenarios 
in which focal species and/or communities migrate to track current 
climate and thus experience little change in temperature, we transplanted
them back into their current elevation site. The direction of
transplantation is thus meant to reflect future climate conditions, not
the future location of the species (Table 1). Plants moved downhill
experienced an average daily climate warming of around 3 °C
(Extended Data Fig. 1 and Extended Data Table 1), which reflects the
magnitude of climate change predicted for the next 50–100 years in
Switzerland'’. While abrupt climate change experiments, as imposed
here, mimic future conditions”’, testing more gradual species’ responses,
such as adaptation, requires other approaches. We tested the influence
of the four migration scenarios on the performance of four focal alpine
species: Anthyllis vulneraria ssp. alpestris (alpine kidney vetch, hereafter
A. alpestris), Plantago atrata (black plantain), Pulsatilla vernalis (spring
pasqueflower) and Scabiosa lucida (glossy scabious). These species differ
in their dispersal potential (Extended Data Table 2), and their current
ranges do not effectively extend to either the lowest- or highest-elevation
field sites (see Methods and Table 1).

[Schematic: scenarios 1 and 2 (focal species fails to migrate) and scenarios 3 and 4 (focal species migrates upwards), with elevation labels 2,000 m and 2,600 m.]

Figure 1 | Scenarios for the competition experienced by a focal alpine
plant following climate warming. If the focal plant species (green) fails to
migrate, it competes either with its current community (yellow) that also fails to
migrate (scenario 1) or, at the other extreme, with a novel community (orange)
that has migrated upwards from lower elevation (scenario 2). If the focal
species migrates upwards to track climate, it competes either with its current
community that has also migrated (scenario 3) or, at the other extreme, with a
novel community (blue) that has persisted (scenario 4). Table 1 describes
the experimental implementation of these scenarios.

Table 1 | Experimental manipulations corresponding to the different competitive scenarios experienced by a focal plant following climate warming

Scenario  Focal species’ response to warming          Origin of focal species  Competitor scenario                             Origin of competitors  Elevation of transplant site
1         Focal species fails to migrate and          2,000 m                  Current competitors persist in warmer climate   2,000 m                1,400 m
          experiences warming
2         Focal species fails to migrate and          2,000 m                  Low-elevation competitors migrate up and        1,400 m                1,400 m
          experiences warming                                                  replace current competitors
3         Focal species migrates up to track climate  2,000 m                  Current competitors migrate up to track         2,000 m                2,000 m
          (as if tracking climate warming)                                     climate and replace high-elevation competitors
4         Focal species migrates up to track climate  2,000 m                  High-elevation competitors persist in warmer    2,600 m                2,000 m
          (as if tracking climate warming)                                     climate

¹Institute of Integrative Biology, ETH Zurich, Universitätstrasse 16, 8092 Zurich, Switzerland. ²Department of Botany and Plant Sciences, University of California Riverside, 900 University Avenue, Riverside, California 92521, USA.

24 SEPTEMBER 2015 | VOL 525 | NATURE | 515

Figure 2 | Effect of novel competitors on alpine plant performance.
Survival over 2 years (a, b), second year biomass (c, d) and second
year flowering (e, f) of focal species exposed to different potential
competition scenarios following climate warming (see Table 1). Shown are
means (s.e.m.) of the raw data. When the novel competitor × site × species
interaction was significant (a–d), P values for species-by-site specific
contrasts were taken from the full model (see Table 2 for statistics and n),
else from site-specific contrasts averaging over species (e, f); P values
<0.005 remain significant (α = 0.05) after Holm–Bonferroni correction for
multiple comparisons.

Table 2 | Statistical analysis of focal alpine plant performance

                               Survival           Biomass            Flowering
Source                 d.f.    χ²       P         χ²       P         χ²       P
Novel competitor (NC)  1       2.53     0.112     0.01     0.944     0.03     0.861
Site (S)               1       32.38    <0.001    <0.01    0.991     1.08     0.298
Species (Sp)           3       67.21    <0.001    55.98    <0.001    61.87    <0.001
NC × S                 1       28.55    <0.001    26.04    <0.001    5.28     0.022
NC × Sp                3       0.93     0.818     14.08    0.003     5.64     0.131
S × Sp                 3       12.34    0.006     18.81    <0.001    17.41    0.001
NC × S × Sp            3       10.14    0.017     10.16    0.017     5.02     0.170
n (n blocks)                   473 (20)           291 (20)           363 (20)

Shown are likelihood ratio tests for novel competitor (current versus novel competitors), site (1,400 or
2,000 m experimental site), and species effects and their interactions on survival after 2 years, and
biomass and flowering probability in the second year of the experiment. Also shown are the total
number of observations (n) and experimental units (n blocks) for each model.

The response of the focal species to novel competitors depended on
whether they grew at the experimental site with warmer or current
climate conditions (Fig. 2; significant novel competitor × site, or novel
competitor × site × species interactions in Table 2). When the focal
species experienced increased temperature (transplantation to lower
elevation to simulate climate warming at the trailing edge of their
range), their performance 2 years after transplantation depended
strongly on the origin of their competitors (Fig. 2). For three of four 
species, survival was reduced by 52-84% (Fig. 2a), biomass by 48-61% 
(Fig. 2c; n.s. for A. alpestris) and flowering by over 72% (Fig. 2e) when 
competing against a novel, low-elevation plant community (scenario 2) 
compared with their current alpine community (scenario 1). The bio- 
mass reduction due to these potential migrants from lower elevation 
was significant even in the first year of the study (Extended Data Fig. 2; 
novel competitor χ² = 17.66, d.f. = 1, P < 0.001). We found much
weaker effects of changing competitor identity when focal species were 
transplanted back into their current elevation to simulate migration 
and climate tracking. Here, whether focal species competed with a 
novel high-alpine community (scenario 4) or their current community 
(scenario 3) had no significant effect on survival (Fig. 2b) or flowering 
(Fig. 2f), and modest, largely non-significant effects on biomass 
(Fig. 2d). The one exception was the strong response of A. alpestris 
biomass to novel competitors, but this response was replicated when it 
grew without any competitors on the soils from the two elevations 
(χ² = 7.31, d.f. = 1, P = 0.007; Extended Data Fig. 3), suggesting a limited
role for shifting competitor identity.
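
The P values attached to the one-degree-of-freedom likelihood ratio statistics quoted above can be checked with the Python standard library. This sketch assumes the usual asymptotic result that the statistic follows a chi-squared distribution; for 1 d.f. the upper-tail probability reduces to a complementary error function. The function name is hypothetical.

```python
import math


def lrt_p_value_df1(chi2_stat):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    if chi2_stat < 0:
        raise ValueError("the test statistic must be non-negative")
    # For 1 d.f., P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2_stat / 2.0))


# Statistics quoted in the text (both with d.f. = 1).
print(round(lrt_p_value_df1(7.31), 3))   # 0.007, as reported
print(lrt_p_value_df1(17.66) < 0.001)    # True, consistent with P < 0.001
```
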

Plant performance in our experiment might be affected by factors 
other than competitor identity that differ between the communities, 
including soil chemistry and biota. To evaluate soil effects on plant 
growth, we grew the focal species at each site without competition on 
soil originating from each elevation. We found that focal species 
tended to grow better at lower- versus higher-elevation (site 
χ² = 24.31, d.f. = 1, P < 0.001) on a common 2,000 m soil, but their
response to soil origin never matched significant biomass responses to 
novel competitors, suggesting that the observed changes in perform- 
ance in Fig. 2 were indeed due to shifting plant competition (with the 
exception of A. alpestris as mentioned above; Extended Data Fig. 3). 
We also conducted a follow-up greenhouse experiment to isolate the 
effects of soil biota from different elevations. Soil organisms could 
affect plant competition if they fail to migrate synchronously with 
the plant communities in the future. Results suggest that soil biota 
from the different elevations did not affect the relative performance 
of alpine versus sub-alpine plant competitors (Extended Data Fig. 4 
and Extended Data Table 3). Related to this, we did not find differences 
in the incidence of herbivory across the two community types at the 
low-elevation site (where competitor identity effects were strong), 
except for two species in the first year of the study only (and this did 
not relate to subsequent survival or biomass; Extended Data Table 4). 

In sum, our results show that novel competitors strongly affected 
the performance of alpine plants under increased temperatures, as will 
occur at the trailing edge of their range, but had little effect on plants 
under current temperatures, as would occur following range expansion 
to higher elevation. This asymmetry in the importance of competitor 
identity at the leading versus trailing range edges can be explained 
by the greater functional similarity between the high- and middle- 
elevation communities, measured with field-based trait measurements 
on 61 species. The low- and middle-elevation communities were
2.4 times further apart along the first principal component of trait space
(Fig. 3a; explaining 76% of the variation in community-weighted trait 
means) and their comparison produced seven times the F statistic in a 
permutation-based multivariate analysis of variance (MANOVA) 
(F1,27 = 52.94 versus 7.56, P < 0.001 and P < 0.01, respectively) than
the middle- and high-elevation communities. The greater functional 
similarity of the two higher-elevation communities was caused by 
shared functional traits, particularly leaf size, leaf mass and plant 
height, not shared species; the communities were equally distinct in 
their species composition (Fig. 3b; 1,400 versus 2,000 m permutation
MANOVA, F1,27 = 27.74; 2,000 versus 2,600 m F1,27 = 24.63;
P < 0.001 for both comparisons).

Finally, first-year biomass results (before heavy mortality in the 
second year of the experiment) allowed us to compare the effect of 
warming on our focal species’ interactions with current competitors 
with the effect of community changes that will arise from competitor 
migration and local extinction. We found that when the focal alpine 
plants grew with their current alpine community under warmer tem- 
peratures, they experienced greater competition than under current 
temperatures (pink versus white bars in Fig. 4), but these effects were 
weak and not significant for any species. This result is consistent with 
the mixed results from previous studies of short-term competitor 
dynamics under climate change. By contrast, for P. atrata and
S. lucida, significantly greater effects of competition arose from chan- 
ging competitor identity and warmer conditions (Fig. 4). These are also 
two of the three species with traits predicting relatively poor dispersal 
(Extended Data Table 2) and thus interactions with novel low-eleva- 
tion competitors may determine their eventual persistence. This result 
further suggests that the strongest effects of climate change on com- 
petition in this system are likely to occur after the immigration of novel 
competitors at species’ trailing range edges. 

Our study provides some of the first empirical evidence that 
accounting for novel competitors may be important to predicting 
species’ responses to climate change. Specifically, our results suggest
that species’ range dynamics probably depend not only on their
ability to track climate, but also the migration of their competitors, and 
the extent to which novel and current competitors exert differing 
competitive effects. In our system, populations might persist at their 
trailing range edge in areas soon to be warmer, as long as lower-eleva- 
tion migrants fail to arrive. This prediction parallels results from the 
few mechanistic studies of population decline following climate 
change, where changing biotic interactions appear more important 
than direct physiological effects of warmer temperature’. However, 
our results also suggest that, in some cases, changing competitor iden- 
tity may be less important. We found, for example, that the shift to 
Figure 3 | Functional and floristic community composition. Ordinations of
ten replicate communities from sites at 1,400 m, 2,000 m and 2,600 m elevation
based on (a) a principal component (PC) analysis of community-weighted
means of five functional traits (SLA, specific leaf area; LDMC, leaf dry matter
content), and (b) a principal coordinates analysis of floristic composition
based on Bray–Curtis dissimilarity. In a, arrows show the loading of individual
traits on each principal component axis (loadings have been multiplied by 2
for clarity).
Figure 4 | The response of four alpine species to competition. Plants grew
with either (1) their current competitors and climate (in a site at 2,000 m),
(2) their current competitors and warmer climate (growing at 1,400 m) or
(3) novel competitors from low elevation and warmer climate (growing at
1,400 m). Shown are mean log response ratios (s.e.m.) of above-ground biomass
calculated from plants growing with or without competitors (n = 25, 30, 27, 29,
for each species, respectively). Different letters below the bars for each
species indicate significantly different contrasts (Tukey’s honest significant
difference test, P < 0.05).


novel high-alpine competitors is unlikely to influence the range expan- 
sion of focal species to higher elevation, in agreement with the rapid 
migration of many species upslope with recent climate warming”. 
Future work combining species’ functional traits, detailed distribution 
information and ecological theory may prove particularly useful for 
forecasting how novel competitive interactions determine the response 
of biological diversity to climate change.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Field transplant experiment. We selected three perennial grassland sites (1,400, 
2,000 and 2,600 m above sea level) in the Swiss Alps (Calanda mountain, Canton 
des Grisons), which are all dominated by compact turfs, and contain different, 
overlapping sets of species (Fig. 3). The sites are at maximum 3 km apart with 
similar southeast exposure, slope and calcareous bedrock, but span a steep climate 
gradient, with a temperature range of 6°C from subalpine (1,400 m) to higher 
alpine (2,600 m) sites, as measured over the duration of the experiment (Extended 
Data Fig. 1, Extended Data Table 1). The temperature differences between the 
lower and middle sites, and the upper and middle sites, during the experiment 
were on average 2.6 °C and 3.4 °C, respectively. Precipitation changes were smaller 
but decreased by approximately 16% with both 600 m drops in elevation
(Extended Data Table 1), also consistent with expectations for climate change.
The lower and middle sites are managed as summer pasture, and the upper site is 
grazed by native ungulates. 

At the end of August 2012, 75 cm × 75 cm turfs containing intact plant communities,
including roots and the organic soil layer, were excavated at each site to a
depth of 20 cm. To implement the design in Table 1, the site at 1,400 m received ten
transplanted communities from the 2,000 m site and ten communities transplanted
from other locations at the 1,400 m site. Meanwhile the site at 2,000 m
received ten transplanted communities from the 2,600 m site and ten from other 
locations at the 2,000 m site (Table 1). Soil was obtained from each site (after 
removing the vegetation) and transplanted across sites in the same design. At 
the two transplant destination sites, each treatment (two communities and two 
soils) was assigned at random to one of four plots within each of ten blocks (giving 
80 plots in total). Blocks were separated by 1 m, with 0.5 m between treatments
within blocks. 

Focal individuals of four alpine species (A. vulneraria ssp. alpestris, P. atrata, P. 
vernalis and S. lucida) were obtained by cutting 240 plugs (about 3 cm diameter) 
containing a single adult plant from the 2,000 m site. These species are widespread 
at the 2,000 m site and are either not found or extremely rare in the communities at 
the 1,400 m and 2,600 m sites. Three individuals per species were planted at random,
15 cm apart in a grid within each treatment and block (n = 30 per species,
treatment and site). The lower two transplant sites were fenced to exclude cattle, as 
well as marmots at the 2,000 m site (marmots are not seen at the other sites). 

To minimize transplantation-related issues, we transplanted the communities 
and focal plants in late summer/early autumn, after the plants had already begun 
to senesce, so that they would first experience their new climate in their growth 
phase when they emerged the following spring. The communities were clipped to 
reduce evapo-transpirative stress during transplantation, and the transplants were 
watered and protected with shade cloth for 1 week after transplantation. Any focal 
plants that died within the first 3 weeks were replaced. We note that focal indivi- 
duals and communities did not respond to transplantation in ways that would 
suggest they were poorly adapted to climate or other conditions at lower elevation: 
2,000 m focal plants growing without competition on soils from 2,000 m were
larger when transplanted downslope than when transplanted to the same eleva- 
tion, and grew better on the soil from the lower elevation (1,400 m) site (Extended 
Data Fig. 3). The above-ground biomass of the intact transplanted communities 
was unaffected by transplantation downslope (Extended Data Fig. 5). 

The survival, phenology and number of inflorescences of every focal individual 
were monitored every 2 weeks after snowmelt in 2013 and 2014 (deaths occurring 
in 2014 were confirmed by a final check for surviving plants in early summer 
2015). We do not report flowering incidence in the first year of the study because 
flowering in many alpine plants is determined by conditions in the preceding 
year” (results were generally non-significant). Each focal individual’s leaf number 
and longest leaf length were recorded at planting and in August/September 2013; 
leaf number and the average area of the three largest leaves were measured in 
September 2014. Above-ground biomass of each focal individual was estimated 
from linear models predicting the biomass of destructively harvested individuals 
from outside the experiment (n = 27-39), dependent on the non-destructive mea- 
sures in either 2013 or 2014, including maximum number of inflorescences in 
2014 for all species except P. vernalis (R² = 0.80–0.98). Regressions were forced
through the origin, and negative biomass estimates (possible with interactions) set 
to 0.01 g. Individuals that died up to a month after snowmelt at each site in spring 
2013 were considered to have died of transplant shock and excluded from further 
analysis. Community above-ground biomass was estimated towards the end of 
each summer using pin quadrats calibrated with destructive harvests made at each 
site (n = 20 plots at the 1,400 m site, n = 18 at 2,000 m, n = 10 at 2,600 m). The
composition and cover of species in each replicate community were determined in 
2013 in mid-May (1,400 m) and mid-June (2,000 m), and again in late August/ 
early September (all sites). Temperature and light intensity were recorded at 
30-min intervals using at least one HOBO Pendant data logger (UA-002-64, Onset,
www.onsetcomp.com/) at each site. At the end of each season the communities
were clipped to approximate biomass removal by grazers. 
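
The biomass estimation described above (regressions forced through the origin, with negative estimates set to 0.01 g) can be illustrated with a minimal Python sketch. It assumes two hypothetical predictors (leaf number and longest leaf length) and invented calibration values; the function names are illustrative, and the predictors, interactions and software actually used for each species are not reproduced here.

```python
def fit_through_origin(x1, x2, y):
    """Least-squares fit of y = b1*x1 + b2*x2 with no intercept.

    Solves the 2 x 2 normal equations directly (hypothetical helper).
    """
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("predictors are collinear")
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return b1, b2


def predict_biomass(b1, b2, leaf_number, leaf_length, floor=0.01):
    """Predicted biomass (g); negative estimates are replaced by `floor`."""
    estimate = b1 * leaf_number + b2 * leaf_length
    return floor if estimate < 0 else estimate


# Invented calibration data for destructively harvested plants.
leaf_number = [4, 6, 8, 10]
leaf_length = [30, 45, 50, 80]          # mm
biomass = [0.05 * n + 0.02 * l for n, l in zip(leaf_number, leaf_length)]

b1, b2 = fit_through_origin(leaf_number, leaf_length, biomass)
print(predict_biomass(b1, b2, leaf_number=7, leaf_length=60))
```

Forcing the fit through the origin keeps a plant with no leaves at zero predicted biomass, and the floor keeps every estimate positive so that it can be log-transformed later.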

Statistical analysis of field experiment. No statistical methods were used to 
predetermine sample size. The investigators were not blinded to allocation during 
experiments and outcome assessment. 

The effects of novel competitors on the biomass of focal plants transplanted 
into competitor communities from different locations were analysed with 
mixed-effects models fitted by maximum likelihood. The full model contained 
main effects and all interactions of the ‘novel competitor’ treatment (novel 
versus current competitors), ‘site’ (1,400 or 2,000 m experimental sites) and
‘species’, as well as initial size, as fixed effects, and plot nested in block as
random effects to account for the dependency of observations. Biomass and 
initial size were log-transformed to meet model assumptions of normality and 
homogeneity of variances. The ten replicate communities per treatment and site 
were the biological replicates in our experiment, and this number was chosen to 
be in excess of previous studies that investigated effects of climate change on 
competition within communities (for example, refs 7, 14). Survival until the end 
of 2014 and the probability of flowering in 2014 were analysed in a similar way, 
but using generalized linear mixed models with a binomial family. We tested for 
over-dispersion (that is, clustering of the binary outcomes within species within 
plots) and found no significant contribution based on a likelihood ratio test. The 
statistical significance of individual terms was determined by comparing each 
model with the correspondingly reduced model using likelihood ratio tests. 
When the novel competitor × site × species interaction was significant, the
significance of the novel competitor effect for each species by site combination was
obtained from contrasts within the full model. Significant novel competitor ×
site interactions in the full model (without a significant three-way interaction)
were followed by fitting site-specific models, and testing the novel competitor 
effect using likelihood ratio tests. When the direction of a species’ significant 
biomass response to novel competitors paralleled its response to the bare soil 
from the two competitor communities, we also tested the effect of soil itself. In 
such cases, we used likelihood ratio tests to test the effect of soil origin on 
biomass for the species-site combination of interest. All models were fitted in 
R using the lme4 package.
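
As a worked illustration of the likelihood ratio tests described above, the following Python sketch converts a test statistic into a P value using the χ² distribution. It is not the authors' code (the models were fitted in R); the function names and the log-likelihood values in the usage line are hypothetical, and only the χ² statistics quoted in the text are taken from the paper.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability of a chi-squared variable (integer df >= 1)."""
    if x < 0 or df < 1:
        raise ValueError("x must be >= 0 and df >= 1")
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_k (x/2)^(k-1/2) / Gamma(k+1/2)
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) / math.gamma(1.5)
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= half / (k - 0.5)
        total += math.exp(-half) * term
    return total


def likelihood_ratio_test(loglik_full, loglik_reduced, df):
    """Return (statistic, P) for nested models fitted by maximum likelihood."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log-likelihoods giving a statistic of 7.31 with d.f. = 1.
print(likelihood_ratio_test(-100.0, -103.655, df=1))
# Statistics quoted in the text: 7.31 (d.f. = 1) and 3.17 (d.f. = 3).
print(chi2_sf(7.31, 1), chi2_sf(3.17, 3))
```

With d.f. = 1 the statistic 7.31 gives P of about 0.007, and 3.17 with d.f. = 3 gives a P value close to the reported 0.367.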

A different statistical model was used to compare the response of each focal 
species to competition from its current competitors under current versus warmer 
climate, or to novel competitors under warmer climate. For each species and block, 
log response ratios of biomass (that is, ln(biomass with competition/biomass
without competition)) were calculated on the basis of the biomass of individuals 
competing with a particular plant community versus growing alone on soil from 
the same community. Differences in competitive responses between treatments 
were tested using Tukey’s honest significant difference tests within a linear model 
fitted for each species. We used data from 2013 owing to the high mortality and 
correspondingly low replication for these tests in 2014. 
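
The log response ratio defined above can be sketched in a few lines of Python. The biomass values are invented for illustration, and the helper names are assumptions rather than part of the original analysis, which then compared treatments with Tukey’s honest significant difference tests.

```python
import math
from statistics import mean, stdev


def log_response_ratio(biomass_with, biomass_without):
    """ln(biomass with competition / biomass without competition)."""
    return math.log(biomass_with / biomass_without)


def mean_and_sem(values):
    """Mean and standard error of the mean for a list of block-level values."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Invented block-level biomass (g) for one species in one treatment.
with_competition = [0.40, 0.55, 0.30, 0.62]
without_competition = [1.10, 1.30, 0.90, 1.25]

ratios = [log_response_ratio(w, wo)
          for w, wo in zip(with_competition, without_competition)]
print(mean_and_sem(ratios))
```

A ratio of zero means no effect of competitors, and increasingly negative values mean stronger suppression of the focal plant.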

To investigate the functional composition of each community, data on five 
functional traits (plant vegetative height, specific leaf area (mm² g⁻¹), leaf size
(mm²), leaf mass, leaf dry matter content (mg g⁻¹)) were collected from plants
growing at the field sites in 2014 using standard methods for 61 species that
collectively accounted for 89.4 ± 4.8% (mean ± s.d.) of the relative cover in these
communities (n = 26 or 10 per species for height or leaf traits, respectively). 
Community-weighted means of each trait were calculated by summing the trait 
values of species within each replicate community, weighted by their relative cover. 
Differences between pairs of communities (using a subset of ten replicate com- 
munities from each elevation to ensure equal sampling effort) in terms of com- 
munity-weighted functional trait means were analysed with a permutation-based 
MANOVA (function ‘adonis’ in the R package ‘vegan’, using a Euclidean distance 
matrix), and visualized using a principal components analysis. The same analysis, 
but based on a Bray–Curtis distance matrix, was applied to community differences
in floristic composition in 2013. The effect of site (1,400 m, 2,000 m, 2,600 m) was
highly significant in both cases (F2,27 > 26.18, P < 0.001), and these analyses were
followed by pairwise contrasts between the high and middle sites and between the 
middle and low sites as reported in the main text. Compositional differences were 
visualized using principal coordinates analysis on a Bray–Curtis distance matrix of
log-transformed cover values (mean of the two sampling dates) for the same 
61 species. 
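
The two community descriptors used above, community-weighted trait means and Bray–Curtis dissimilarity, can be written out as a short Python sketch. The species names and cover and trait values are invented, the function names are illustrative, and the sketch omits the log transformation of cover and the permutation-based MANOVA (the ‘adonis’ function in ‘vegan’) used in the actual analysis.

```python
def community_weighted_mean(cover, trait):
    """Sum of species trait values weighted by relative cover."""
    total_cover = sum(cover.values())
    return sum(c / total_cover * trait[species] for species, c in cover.items())


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two cover dictionaries (0 to 1)."""
    species = set(a) | set(b)
    numerator = sum(abs(a.get(s, 0.0) - b.get(s, 0.0)) for s in species)
    denominator = sum(a.get(s, 0.0) + b.get(s, 0.0) for s in species)
    return numerator / denominator


# Invented cover (%) for two replicate communities and one trait (height, cm).
community_1 = {"sp_a": 60.0, "sp_b": 40.0}
community_2 = {"sp_b": 30.0, "sp_c": 70.0}
height = {"sp_a": 10.0, "sp_b": 20.0, "sp_c": 45.0}

print(community_weighted_mean(community_1, height))
print(community_weighted_mean(community_2, height))
print(bray_curtis(community_1, community_2))
```

Two communities can share few species, and so have a high Bray–Curtis dissimilarity, while still having similar community-weighted trait means, which is the distinction drawn between floristic and functional composition.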

Soil biota experiment. Whether the microbial community at each elevation will 
migrate in unison with the plants from those elevations is not clear. We therefore 
conducted a follow-up greenhouse experiment to investigate potential effects of 
the soil biota from the lower elevation (1,400 m) and alpine (2,000 m) sites on the 
relative performance of lower versus higher-elevation competitors. We grew three 
plant species from the 1,400 m site and three 2,000 m focal species (all but
A. alpestris) with soil inoculum from the 1,400 m and 2,000 m sites. The background
soil was a mixture of soil collected at the 1,400 m site and a 2,200 m site,
which had been sieved, homogenized and sterilized. We sterilized by autoclaving
at 121 °C for 20 min, and again after a 2 day incubation period. Live soil inoculum 
was collected from soil cores from the top ca. 15 cm of the soil profile at the 1,400 
and 2,000 m sites in October 2014, and sieved and stored at 4 °C before use.
Seedlings were germinated from field-collected seed on filter paper, and then
transplanted as a single individual to a 360 ml pot containing sterilized background
soil and an inoculum (9% of total soil mass) of live soil from one of the
elevations. One pot from each species/soil combination was arranged at random
within a block (n = 10 blocks) on a single bench in a glasshouse in Zurich,
Switzerland (set to 20 °C, 14 h day, with supplementary lighting). Each pot
received its own drip tray to minimize cross-contamination during watering. 
After 3 months, above-ground plant parts were harvested to determine dry mass. 
For each species, a linear mixed-effects model containing soil community origin
as a fixed effect and block as a random effect was fitted by maximum likelihood,
and compared with a simpler model without soil community using a likelihood
ratio test.
Extended Data Figure 1 | Daily mean temperature during the study at the three experimental sites.
Extended Data Figure 2 | Effect of novel competitors on alpine plant
biomass in 2013. Focal species were exposed to different competition
scenarios, depending on whether they and/or their surrounding community
would either migrate, or fail to migrate, following climate warming (see Fig. 1).
Shown are means (s.e.m.) of the raw data, and likelihood ratio tests
(d.f. = 1, n = 182 (a) and 221 (b), n = 10 experimental units (blocks) per site) of
the novel competitor effect at each experimental site (in the main model, across
all species and sites: novel competitor × site interaction χ² = 8.42, d.f. = 1,
P = 0.004; novel competitor × site × species interaction χ² = 3.17, d.f. = 3,
P = 0.367).


©2015 Macmillan Publishers Limited. All rights reserved 




[Figure panels: a, Focal species at 1,400 m (soil from 2,000 m or soil from 1,400 m); b, Focal species at 2,000 m (soil from 2,000 m or soil from 2,600 m). Species on the x axis: Anthyllis alpestris, Plantago atrata, Pulsatilla vernalis, Scabiosa lucida.]
growing on soils without competition. Plants grew under a warmer climate
means (s.e.m.) of the raw data (total n = 314).
with soil biota originating from lower elevation, but this effect was shared
across species from 2,000 m (in yellow, focal species from the field experiment)
relative performance of 1,400 and 2,000 m plants. Shown are means (s.e.m.) of
standardized plant biomass. For statistics and n see Extended Data Table 3.
Extended Data Figure 5 | Above-ground community biomass. Standing biomass was estimated in late summer 2013 (a) and 2014 (b) in the plant communities
from sites at 1,400, 2,000 and 2,600 m (mean ± s.e.m., n = 10 per community and site), growing in sites at either 1,400 m, 2,000 m or 2,600 m.
Extended Data Table 1 | Environmental characteristics of the three study sites

Variable                                          1,400 m site   2,000 m site   2,600 m site
Latitude (°N)                                     46.8692        46.8879        46.8931
Longitude (°E)                                    9.4900         9.4895         9.4705
Aspect                                            SE             SE             SE
Daily mean temp. 2013 (°C)                        6.8            4.2            1.2
Daily mean temp. 2014* (°C)                       7.7            4.6            0.4
Two year average of daily mean temperature (°C)   7.0            4.4            1.0
Interpolated annual precipitation (mm)            1169           1355           1573

Temperatures were determined from temperature loggers placed at each site (see Extended Data Fig. 1). Precipitation data were obtained from interpolations of Swiss climate (ca. 1961-1990) at 50 m
resolution.
*Until 2 July 2014.
Extended Data Table 2 | Characteristics of the focal species

Species              Family           Growth form   Seed mass (mg)   Inflorescence height (cm)   Terminal velocity of seeds (m/s)   Predicted maximum dispersal (m)
Anthyllis alpestris  Fabaceae         Erect         4.63             8.0                         15                                 1.9
Plantago atrata      Plantaginaceae   Rosette       2.98             6.5                         75                                 0.2
Pulsatilla vernalis  Ranunculaceae    Erect         1.73             6.0                         1.0                                6.9
Scabiosa lucida      Caprifoliaceae   Rosette       1.14             13.0                        2.1                                0.6

Terminal velocity was taken from the Dispersal and Diaspore database. Maximum dispersal was predicted from species’ family, terminal velocity and dispersal mode (only P. vernalis seeds are specialized for
wind dispersal), following statistical model 1 of ref. 30 implemented in R with the function ‘dispeRsal’ where growth form = herb for all species.
Extended Data Table 3 | Statistical analysis of the effects of soil biota on plant biomass

Species                    Seed origin   n    χ²     P
Helianthemum nummularium   1,400 m       20   0.88   0.349
Plantago lanceolata        1,400 m       16   8.13   0.004
Plantago media             1,400 m       20   8.27   0.004
Plantago atrata            2,000 m       20   6.72   0.010
Pulsatilla vernalis        2,000 m       12   0.37   0.544
Scabiosa lucida            2,000 m       18   5.38   0.020

All models contained block as a random effect; d.f. = 1 for χ² tests of the effects of soil biota (from 1,400 or 2,000 m above sea level) on above-ground biomass.
Extended Data Table 4 | Analysis of herbivory on four alpine plant species

Year   Species               Site      χ²     P       n    n block   r with survival   r with biomass
2013   Anthyllis alpestris   1,400 m   …      0.026   31   9         -0.16             0.08
2013   Anthyllis alpestris   2,000 m   …      0.854   59   10        -0.01             -0.27
2013   Plantago atrata       1,400 m   …      0.282   56   10        -0.08             -0.29
2013   Plantago atrata       2,000 m   …      0.808   59   10        -0.21             0.17
2013   Pulsatilla vernalis   1,400 m   …      0.082   46   10        0.13              -0.12
2013   Pulsatilla vernalis   2,000 m   NA*    NA      NA   NA        NA                NA
2013   Scabiosa lucida       1,400 m   …      0.033   49   10        -0.25             -0.08
2013   Scabiosa lucida       2,000 m   …      0.614   56   10        0.02              0.00
2014   Anthyllis alpestris   1,400 m   …      0.380   14   7         -0.17             0.29
2014   Anthyllis alpestris   2,000 m   …      0.031   49   10        -0.07             0.24
2014   Plantago atrata       1,400 m   …      0.183   40   10        -0.08             0.02
2014   Plantago atrata       2,000 m   …      0.467   52   10        0.04              0.00
2014   Pulsatilla vernalis   1,400 m   …      0.824   16   8         -0.20             0.27
2014   Pulsatilla vernalis   2,000 m   …      0.173   42   10        -0.12             -0.38
2014   Scabiosa lucida       1,400 m   NA*    NA      NA   NA        NA                NA
2014   Scabiosa lucida       2,000 m   0.52   0.470   48   10        NA                0.20

The effect of competitor community identity on the incidence of herbivory on focal species was assessed with mixed-effects models fitted separately for each species and site, including plot nested in block as
random effects and log(initial size) as a fixed effect. Shown are likelihood ratio tests (d.f. = 1), and the total number of observations (n) and experimental units (n block) for each model. Also shown are correlations of
the incidence of herbivory with biomass and survival after 2 years (P < 0.05 indicated in bold). No tests are significant after Holm–Bonferroni correction.

*Model could not be fitted because herbivory was constant.
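
The statement that no herbivory test remains significant after Holm–Bonferroni correction can be checked with a minimal Python sketch of the step-down procedure. The P values are the 14 non-missing entries of the table above; treating them as a single family of tests is an assumption made here for illustration, since the grouping used for the correction is not stated, and the function name is hypothetical.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        # Step-down threshold: alpha / (m - rank), with rank starting at 0.
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # once one test fails, all larger P values also fail
    return reject


# P values from the fitted models in Extended Data Table 4 (NA rows omitted).
herbivory_p = [0.026, 0.854, 0.282, 0.808, 0.082, 0.033, 0.614,
               0.380, 0.031, 0.183, 0.467, 0.824, 0.173, 0.470]

print(any(holm_bonferroni(herbivory_p)))  # False: no test survives correction
```

The smallest P value (0.026) already exceeds the first threshold of 0.05/14, so the procedure stops at the first step.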
A sexually dimorphic hypothalamic circuit controls
maternal care and oxytocin secretion 


Niv Scott1, Matthias Prigge1, Ofer Yizhar1 & Tali Kimchi1


It is commonly assumed, but has rarely been demonstrated’”, that sex 
differences in behaviour arise from sexual dimorphism in the under- 
lying neural circuits’*. Parental care is a complex stereotypic beha- 
viour towards offspring that is shared by numerous species’. Mice 
display profound sex differences in offspring-directed behaviours. 
At their first encounter, virgin females behave maternally towards 
alien pups while males will usually ignore the pups or attack them®”’. 
Here we show that tyrosine hydroxylase (TH)-expressing neurons in 
the anteroventral periventricular nucleus (AVPV) of the mouse hypo- 
thalamus are more numerous in mothers than in virgin females and 
males, and govern parental behaviours in a sex-specific manner. In 
females, ablating the AVPV TH+ neurons impairs maternal behaviour,
whereas optogenetic stimulation or increased TH expression in
these cells enhances maternal care. In males, however, this same neuronal
cluster has no effect on parental care but rather suppresses inter-male
aggression. Furthermore, optogenetic activation or increased
TH expression in the AVPV TH+ neurons of female mice increases
circulating oxytocin, whereas their ablation reduces oxytocin levels.
Finally, we show that AVPV TH+ neurons relay a monosynaptic
input to oxytocin-expressing neurons in the paraventricular nucleus. 
Our findings uncover a previously unknown role for this neuronal 
population in the control of maternal care and oxytocin secretion, and 
provide evidence for a causal relationship between sexual dimorphism 
in the adult brain and sex differences in parental behaviour. 

The hypothalamus contains several sexually dimorphic nuclei, 
and has a critical role in coordinating sexual dimorphism in repro- 
ductive behaviours and physiological responses to environmental 
cues. Among the sexually dimorphic hypothalamic nuclei, the
AVPV (Fig. 1a) is unique as it possesses several female-biased sexually
dimorphic characteristics, including a markedly larger number of
tyrosine hydroxylase (TH)-immunoreactive neurons in females than
in males. The presence of TH in these cells suggests that they are
dopaminergic, but this has not been tested directly. 




Figure 1 | TH expression in the AVPV is sexually dimorphic and enhanced 
in postpartum females. a, Schematic drawing of mouse AVPV (top) and 
confocal images of a coronal brain section immunostained for TH (bottom). 
Inset shows higher magnification image of the AVPV region. Scale bar,
1 mm. b, AVPV in a coronal slice from a female mouse immunostained
We first characterized the TH+ AVPV neurons in adult mice and
found that 93% (±1%, n = 6 mice) of them co-express DOPA decarboxylase
(DDC), the enzyme responsible for the formation of dopamine
from DOPA (3,4-dihydroxyphenylalanine) (Fig. 1b). Since dopamine
signalling is known to enhance mother–pup interactions, we
hypothesized that TH+ AVPV neurons are part of a sexually dimorphic
circuit that underlies sex differences in parental care. We first tested
whether parenthood is associated with changes in the number of TH+
AVPV cells. In sexually naive (virgin) females, the number of TH-positive
neurons was significantly higher than that in virgin males
(Fig. 1c, d), consistent with previous work. Surprisingly, we further
found that in postpartum females this number was significantly higher
than in virgin females (725 ± 21 and 493 ± 60, respectively; P < 0.001),
whereas no such differences were observed between parental and virgin
males (271 ± 21 and 273 ± 19, respectively; Fig. 1c, d).

To assess the functional relationship between sexual dimorphism 
in TH+ AVPV neurons and sex differences in parental behaviour,
we employed three complementary cell-type-specific strategies to selectively
manipulate TH+ AVPV neurons in adult males and females:
selective ablation, overexpression of TH and optogenetic activation
(Fig. 2a–e). To ablate TH+ AVPV neurons, we bilaterally injected the
neurotoxin 6-hydroxydopamine into the AVPV of
wild-type mice. This effectively ablated the majority of TH+ AVPV
neurons, but did not affect major TH-expressing neuronal populations
in other brain regions (Fig. 2a and Extended Data Fig. 1a–d). Virgin
females with bilateral ablation of TH+ AVPV neurons exhibited a profound
reduction in maternal displays, including a significantly longer
latency to retrieve foreign pups to the nest and a smaller number of pups 
retrieved compared to control littermates (Fig. 2f, g and Extended Data 
Fig. 2a, b). TH-ablated virgin females also exhibited a shorter duration of 
crouching over the pups and a shorter overall duration of maternal 
behaviour (Fig. 2h and Extended Data Fig. 2c, d). In postpartum females 
(mothers), TH+ AVPV neuronal ablation induced similar deficits in


[Fig. 1d panel: y axis, Number of TH+ neurons in AVPV; groups, Virgins and Parents.]


for TH (green) and DDC (red). Scale bars, 20 μm. c, Immunostaining for TH in
AVPV of female and male virgins and parents. Scale bars, 50 μm. d, Total
numbers of TH+ neurons in AVPVs of virgin females, virgin males, postpartum
females and newly parental males. Data are means ± s.e.m., n = 5 per group;
***P < 0.001; two-way ANOVA with Fisher’s multiple comparisons.


1Department of Neurobiology, Weizmann Institute of Science, 76100 Rehovot, Israel. 
Figure 2 | TH+ AVPV neurons promote maternal behaviour in virgin and
postpartum females. a, Coronal section of female AVPV immunostained for
TH (green). Scale bars, 50 μm. b, Schematic drawing of TH overexpression
vector (top) and confocal images of coronal sections of unilateral TH-
overexpression in the AVPV (bottom). mRFP labels virally transduced neurons
(red). TH-immunoreactive neurons are labelled in green. Scale bars, 50 μm.
c, Schematic drawing of AAV vector used to express ChR2 in TH+ AVPV
neurons (top), confocal image depicting ChR2–mCherry expression (red) and
TH immunostaining (green) in AVPV neurons (bottom). Scale bar, 50 μm.
d, Schematic illustration of photostimulation of TH+ AVPV neurons
expressing ChR2 (top) and an example voltage recording of a ChR2-expressing
AVPV neuron stimulated with light pulses at 1 Hz (bottom). Blue bars
represent 10-ms light pulses (475 nm, 19 mW mm⁻²); black dots indicate
detected action potentials; green outlines represent action potentials that are
time-locked to light pulses. e, Representative images from a TH-ChR2
mouse and TH-EYFP (control) mouse photostimulated in the AVPV.
c-Fos-immunoreactive cells are labelled in red (top). Scale bars, 20 μm. Number
of c-Fos-expressing cells in AVPV following optogenetic stimulation in TH-
ChR2 and control mice (bottom) (nChR2 = 6, ncontrol = 6, ***P < 0.001, two-
tailed Student’s t-test). f–h, Quantification of maternal behaviour of virgin
females with TH-ablation (grey bars), TH-OE (red bars) and TH-ChR2 (blue
bars) relative to control groups (TH-ablation, nablation = 13, ncontrol = 13; TH-
OE, nOE = 9, ncontrol = 10; TH-ChR2, nChR2 = 12, ncontrol = 14; *P < 0.05,
**P < 0.01, ***P < 0.001, Mann–Whitney U-test). i, Latency to retrieve the 1st
pup of postpartum females with TH-ablation, TH-OE and TH-ChR2 relative to
control groups (TH-ablation, nablation = 11, ncontrol = 11; TH-OE, nOE = 8,
ncontrol = 7; TH-ChR2, nChR2 = 6, ncontrol = 6; *P < 0.05, **P < 0.01, Mann–
Whitney U-test). Data are means ± s.e.m.


maternal behaviour, manifested as longer latencies to pup retrieval 
(Fig. 2i and Extended Data Fig. 2e). The AVPV was reported to play a 
role in the regulation of GnRH neurons (known to control ovulation).
However, we found that this TH⁺-specific ablation in the AVPV did not
cause marked changes in female sexual behaviour, oestrous cycle and 
reproductive success (Extended Data Fig. 3). 

The expression level of TH in some dopaminergic neurons is activity-
dependent and modulated by sensory stimuli. To examine whether
elevated TH expression levels in TH⁺ AVPV neurons are causal in
driving maternal behaviour, we overexpressed TH specifically in TH⁺
AVPV neurons of TH-Cre females using a Cre-dependent TH-expressing
adeno-associated virus (AAV) vector (TH-OE; Fig. 2b). In contrast
Figure 3 | TH⁺ AVPV neurons suppress inter-male aggression but do not
regulate parental behaviour in males. a, d, Quantification of pup-directed
behaviours of virgin males with TH-ablation, TH-OE and TH-ChR2 relative to
control groups. b, c, Pup retrieval of paternal males with TH-ablation, TH-OE
and TH-ChR2 relative to control groups. e, f, Quantification of inter-male
aggression in the resident-intruder test. Data are means ± s.e.m., TH-ablation,
n_ablation = 12, n_control = 12; TH-OE, n_OE = 10, n_control = 10; TH-ChR2,
n_ChR2 = 8, n_control = 7; *P < 0.05, Mann-Whitney U-test.
to the effect of TH⁺ cell ablation, TH-OE promoted maternal behaviour
in both virgin and postpartum females (Fig. 2f-i). TH-OE virgin females
presented significantly shorter latencies to pup retrieval, and retrieved
more pups to the nest compared to virgin females injected with a control
mCherry-expressing AAV vector (Fig. 2f, g and Extended Data Fig. 4a, b).
Similarly, TH-OE in mothers induced shorter latencies to pup retrieval
(Fig. 2i and Extended Data Fig. 4e). This suggests that maternal pup
retrieval is facilitated by increased TH levels in TH⁺ AVPV cells.

To determine whether activation of the TH⁺ AVPV circuit can
promote maternal behaviour, we injected virgin TH-Cre females with
a Cre-dependent channelrhodopsin-2 AAV vector into the AVPV
(TH-ChR2; Fig. 2c). We first recorded in acute brain slices from
ChR2⁺ and ChR2⁻ neurons in the AVPV to verify effective photostimulation
of ChR2-expressing cells. The spontaneous firing rates of
TH⁺ AVPV neurons were similar to those described previously for
AVPV neurons, and their intrinsic electrophysiological properties
resembled those of midbrain dopaminergic neurons (Fig. 2d and
Extended Data Fig. 5a-j). We also confirmed in vivo activation of these
neurons using c-Fos staining (Fig. 2e). Tonic 1 Hz photostimulation of
TH⁺ AVPV neurons in virgin females elicited a shorter latency to
retrieve pups to the nest and increased the number of pups retrieved
(Fig. 2f, g; Extended Data Fig. 6a, b and Supplementary Videos 1, 2).
Moreover, photostimulation promoted crouching behaviour and prolonged
the overall duration of maternal care (Fig. 2h and Extended
Data Fig. 6c, d). Similarly, activation of TH⁺ AVPV neurons in
mothers promoted pup retrieval to the nest, but did not affect maternal
aggression, an adult-directed aggressive behaviour in lactating dams
(Fig. 2i and Extended Data Fig. 6e-g). In all of the TH-manipulated
groups, we verified that the observed effects on maternal behaviour are
not accompanied by changes in anxiety, exploration, or locomotor
activity (Extended Data Fig. 7a-l). These findings indicate that TH⁺
AVPV neurons are required for the control of maternal behaviour in
virgin and postpartum females.
We next examined whether TH⁺ AVPV neurons are involved in the
regulation of parental behaviour in males. To address this, we manipulated
TH⁺ AVPV neurons in males using the same TH-specific strategies.
In contrast to the robust effects seen in females, neither ablation,
TH-overexpression nor optogenetic activation of TH⁺ AVPV neurons
induced significant changes in parental behaviour in virgin males or
parental males (fathers) (Fig. 3a-c, Extended Data Figs 2f-h, 4f-h and
6h-j). Notably, however, ablation of TH⁺ AVPV neurons in virgin
males induced a trend towards increased pup-directed aggression
(P = 0.09; Fig. 3d and Extended Data Fig. 2h). To further assess whether
TH⁺ AVPV cells contribute to the regulation of male aggression, we
tested the effects of our TH-manipulations on inter-male aggression by
using the resident-intruder assay. We found that males in which TH⁺
AVPV neurons were ablated presented an increase in attack bursts
and in the total duration of aggression (Fig. 3e, f and Extended Data
Fig. 2i, j). In contrast, optogenetic activation of TH⁺ AVPV neurons led
to a significant reduction in these behavioural measures (Fig. 3e, f,
Extended Data Fig. 6k, l and Supplementary Videos 3, 4). TH-manipulated
males displayed no changes in anxiety, exploration or locomotor
activity (Extended Data Fig. 8a-j). We conclude that in males, TH⁺
AVPV neurons act as negative regulators of inter-male aggression but
do not contribute to regulation of parental care.

Several hormones, including oestradiol, corticosterone, prolactin
and oxytocin, have been implicated in the regulation of sex-specific
parental behaviour. This could suggest that the regulation of
maternal behaviour by TH⁺ AVPV neurons is associated with the
action of these hormones. We measured changes in the peripheral
levels of these hormones in the absence of pups (Fig. 4a and Extended
Data Fig. 9a-e). We found that oxytocin (OT) levels, but not the
levels of other hormones, were reduced following TH⁺ AVPV neuronal
ablation and elevated in TH-OE virgin females compared to
controls (Fig. 4a and Extended Data Fig. 9b, d, e). Remarkably, we
further found that optogenetic stimulation of TH⁺ AVPV neurons at
1 Hz for 10 min was sufficient to induce a significant increase in OT
levels in ChR2-expressing virgin females (Fig. 4a). In contrast, none
of the TH⁺ AVPV manipulations induced significant changes in
circulating OT levels in virgin males (Extended Data Fig. 9a). TH⁺
AVPV neurons thus seem to directly regulate the release of circulating
OT in females.

We next explored the anatomical and functional connectivity
between TH⁺ AVPV neurons and OT-secreting neurons in the paraventricular
nucleus (PVN) and the supraoptic nucleus (SON). We
fluorescently labelled the axonal projections of the TH⁺ AVPV neurons
by injecting a Cre-dependent enhanced yellow fluorescent protein
(EYFP)-expressing viral vector into the AVPV of TH-Cre mice.
Projection analysis revealed several brain regions containing fluorescently
labelled TH⁺ fibres in both males and females (Extended Data
Fig. 10a-d). Among the brain regions that showed the densest
projections were the medial preoptic area, a region associated with
parental behaviour, and the PVN. Immunostaining for OT in
the PVN revealed dense TH⁺ AVPV fibres in close proximity to
OT⁺ PVN neurons (Fig. 4b). Additionally, we injected a Cre-dependent
anterograde viral tracer (H129ΔTK-TT) into the AVPV of TH-Cre
females. Immunohistochemical staining for OT in the PVN showed
that 55 ± 7% of OT⁺ neurons co-labelled with the anterograde virus
(Fig. 4c, e). In contrast, similar staining in the SON did not reveal any
virally transduced OT⁺ neurons (Fig. 4d, e).

To test for a functional synaptic connection between TH⁺ AVPV
neurons and OT⁺ PVN cells, we obtained whole-cell patch clamp
recordings from PVN OT⁺ cells and photostimulated ChR2⁺ axonal
projections from the AVPV. For these recordings, we labelled OT⁺
PVN neurons of TH-Cre females with an AAV vector encoding the
Venus fluorophore under an OT promoter. A second AAV, encoding a
Cre-dependent ChR2-mCherry fusion protein, was injected into
the AVPV. Acute slices prepared from these mice showed
ChR2-mCherry-labelled fibres throughout the region containing
Figure 4 | TH⁺ AVPV neurons form synaptic connections with OT⁺ PVN
neurons and regulate OT secretion in females. a, Plasma OT levels in
TH-ablation, TH-OE and TH-ChR2 females relative to control groups (data are
means ± s.e.m., TH-ablation, n_ablation = 8, n_control = 5; TH-OE, n_OE = 9,
n_control = 9; TH-ChR2, n_ChR2 = 7, n_control = 8; *P < 0.05, ***P < 0.001,
two-tailed Student’s t-test). b, Coronal sections of the PVN in TH-Cre female
injected unilaterally with Cre-dependent EYFP vector into the AVPV.
Anterograde fibres projecting from TH⁺ AVPV neurons (green) lie in the
vicinity of OT⁺ neurons in the PVN (red). Scale bars, 50 µm (top) and 10 µm
(bottom). c, Coronal sections of TH-Cre female injected with Cre-dependent
anterograde tracer (H129ΔTK-TT) into the AVPV, showing virally labelled
OT⁺ neurons in the PVN. Arrows indicate virally transduced neurons
(green pseudocolour) that are OT⁺ (red pseudocolour). Scale bars, 10 µm.
d, Coronal section of the SON of TH-Cre female injected with Cre-dependent
anterograde tracer into the AVPV, showing virally labelled neurons (green)
and OT⁺ neurons (red). Scale bar, 10 µm. e, Percentage of virally transduced
OT⁺ neurons in the PVN and SON (PVN: 55 ± 7%, SON: none, n = 3 mice).
f, Illustration of the experimental approach. Top, schematic drawing of a
mouse brain in a sagittal view injected with AAV-DIO-ChR2-mCherry into the
AVPV and with AAV-OT-Venus into the PVN. Bottom, schematic drawing
and confocal image of PVN coronal section showing the OT⁺ neurons
transduced with AAV-OT-Venus (green) and fibres from TH⁺ AVPV neurons
expressing ChR2-mCherry (red). Scale bar, 10 µm. g, Light-evoked responses
in OT⁺ cells recorded in voltage clamp mode at the specified holding potentials.
h, Average light response (black) overlays 50 single-trial responses (grey) of a
representative OT⁺ cell clamped to −70 mV. i, Distribution of
light-induced response latencies from 50 light stimuli recorded in a single OT⁺
cell (median = 3.77 ms from light onset). SON, supraoptic nucleus; PVN,
paraventricular nucleus.


OT⁺ neurons, confirming the results of our viral tracing experiments
(Fig. 4f). Photostimulation of TH-ChR2 fibres in the PVN evoked fast
excitatory post-synaptic currents in 7 out of 19 OT⁺ PVN neurons
recorded (3.81 ± 0.83 ms latency from blue light onset; n = 7 cells
from 3 mice; Fig. 4g-i). These results indicate that a monosynaptic
connection exists between the TH⁺ AVPV neurons and OT⁺ PVN
cells, suggesting that TH⁺ AVPV neurons can facilitate OT release
from OT⁺ PVN neurons.

Our study reveals that a female-biased cluster of TH-expressing 
neurons in the AVPV has an essential role in the control of sex-specific 
behaviours in both sexes. Whereas in females this neural population 
acts to promote maternal care and OT secretion, in males it is rather 
involved in the suppression of adult-directed aggression. Moreover, we 




have uncovered a novel monosynaptic circuit linking the TH⁺ AVPV
neurons with PVN OT⁺ neurons. In rodents, as in many mammals,
OT signalling is necessary for lactation and has been implicated as a
central neuroendocrine mediator of maternal behaviour. Based
on our findings and on these previous reports, we propose that in
females, pup-mediated signals trigger the activation of TH⁺ AVPV
neurons, which in turn activate oxytocinergic neurons in the PVN,
releasing OT into the brain and periphery and facilitating instinctive
parental behaviours (Extended Data Fig. 10e).

Finally, although manipulation of TH⁺ AVPV neurons markedly
altered sex-specific behaviours in both males and females, the behaviours
displayed by manipulated animals of both sexes did not exceed
the boundaries of sex-typical behaviours. These data highlight the
critical role of intrinsic sex differences in the brain in setting the distinct
behavioural repertoire displayed by males and females. Within
these boundaries, changes in the activity of sexually dimorphic neural
circuits allow dynamic modulation of adaptive behavioural responses
to sex-specific challenges.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


Animals. Male and female mice (12 to 20 weeks old) were used in all experiments.
For TH-ablation experiments, wild-type mice (CD-1, Harlan Laboratories) were
used. For TH-overexpression, TH⁺ optogenetic stimulation and tracing experiments,
TH-IRES-Cre male and female mice were used. Littermates were randomly
assigned to experimental or control groups. We confirmed the specificity of
Cre-expressing neurons in virgin females by showing that a majority of
Cre-expressing cells overlap with TH-immunoreactive cells in the AVPV (EYFP (Cre) cells
expressing TH = 80 ± 3%; TH cells expressing EYFP (Cre) = 91 ± 3%, n = 6
mice). Compared with reports of reduced specificity of this mouse line to dopamine
neurons in the ventral tegmental area, these data indicate that specificity in the
AVPV is relatively high. Furthermore, increased TH expression in maternal
females might further increase the specificity values. All mice were bred and housed
in a specific-pathogen-free animal facility; they were maintained on a reverse
12:12 h light-dark cycle with food and water ad libitum. All experimental procedures
were approved by the Institutional Animal Care and Use Committee
(IACUC) at the Weizmann Institute of Science.

Viral vectors. pAAV5-EF1α-DIO-ChR2(E123T/T159C)-mCherry,
pAAV5-EF1α-DIO-EYFP and pAAV5-EF1α-DIO-mCherry were acquired from UNC
Gene Therapy Center. H129ΔTK-TT virus was a gift of D. J. Anderson and
AAV-OTpr-Venus virus was a gift of V. Grinevich. A Cre-dependent vector for
TH-overexpression (pAAV-EF1α-DIO-TH-p2A-mRFP) was produced using
overlap extension PCR. A 2A peptide was inserted between TH and mRFP coding
sequences to allow unperturbed function of TH. The forward and reverse primers
for the entire PCR fragment contained AscI and NheI restriction sites. The
resulting amplicon was digested and cloned into the pAAV-EF1α-DIO backbone.
6-OHDA. 6-hydroxydopamine hydrochloride (Sigma-Aldrich) was dissolved in
0.1% ascorbic acid saline solution to a final concentration of 6 µg µl⁻¹ and prepared
fresh every day before injections into the AVPV.

Surgery. Mice were anaesthetized with isoflurane and mounted on a stereotaxic
frame (myNeuroLab). Virus or agent was bilaterally or unilaterally injected into
the AVPV (for behaviour or tracing, respectively) using a Hamilton syringe at an
injection rate of 0.05-0.15 µl min⁻¹. AVPV injection coordinates were
anteroposterior (AP): 0.25 mm, mediolateral (ML): ±0.15 mm, dorsoventral (DV):
−5.45 mm. Injection volumes were: 0.2 µl for TH-OE and ChR2 viruses; 0.06 µl
and 0.1 µl for AAV-EYFP and H129ΔTK-TT tracing experiments, respectively;
1 µl for 6-OHDA. AAV-OTpr-Venus was bilaterally injected into the PVN (AP:
−0.8 mm, ML: ±0.2, DV: −4.75) at a volume of 0.3 µl. ChR2 animals used for
behavioural testing were implanted with 200 µm fibre-optic cannulae (Thorlabs).
Fibres were located medially and above the AVPV of both hemispheres (Fig. 2d).
The cannula was secured using dental cement. Mice were left to recover for at least
3 weeks before behavioural assays.

Behavioural assays. All behavioural tests were performed during the dark phase
and under dim red light. Behaviours were recorded using a digital video recording
unit and scored using Observer XT or EthoVision software (Noldus Information
Technology) by an individual blind to the identity of the manipulated groups.
Parental behaviour. Males and females were tested for pup-directed behaviours
using the exact same behavioural assay procedure. Mice were individually housed
with nesting material (cotton) 24 h before the trial. Three newborn alien pups
(1-3 days old, CD-1, Harlan Laboratories) were placed on the opposite side to the
sleeping nest in the resident mouse’s home cage and behaviour was recorded for
15 min. Females were introduced to pups for 3 consecutive days and parental
behaviours were scored on day 3 when animals exhibited the highest level of
parental behaviour. Pup-directed behaviours of males were scored on day 1 since
their behaviour did not change with repeated exposure to pups. The scored
behavioural parameters were (Supplementary Videos 1, 2): pup retrieval, carrying the
pups to the cotton nest; crouching, covering the pups to maintain body temperature
and (in females) lactating behaviour; nesting, building a nest around the
pups; licking, licking the pups; attack, pup-directed attack; ignore, failing to
approach or make contact with the pups. Parental duration was taken as the 
sum of all parental behaviour durations in the assay: crouching, nesting and 
licking. Pup retrieval was scored in points and quantified as a cumulative retrieval 
score in which each successful pup retrieval to the nest contributed 1 point. Pup- 
directed aggression (attack score) was quantified similarly, with each pup attack 
contributing 1 point. Parental behaviours in postpartum females were tested on 
postpartum day 4. For this assay, first all newborn pups were removed from the 
home cage of each female mouse, and then after 10 min, three of the female’s own 
pups were placed back in the resident’s home cage on the opposite side to the 
breeding nest for 10 min. Latencies to retrieval of the pups back to the nest were 
scored as described above. Fathers’ parental behaviour was measured in males that 
were housed together with females for 21-24 days (that is, separated from the 
lactating females and their own pups 1-3 days after the delivery of the pups). 
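
The scoring rules for the parental behaviour assay can be written out compactly. The following Python sketch is illustrative only and is not the analysis code used in the study: the `Bout` record, the behaviour labels and the choice to take the retrieval latency as the onset time of the first retrieval bout are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Behaviours that the text sums into "parental duration".
PARENTAL_BEHAVIOURS = {"crouching", "nesting", "licking"}


@dataclass
class Bout:
    """One scored behavioural bout (hypothetical record format)."""
    behaviour: str  # "retrieval", "crouching", "nesting", "licking", "attack" or "ignore"
    start_s: float  # bout onset, seconds from pup introduction
    end_s: float    # bout offset, seconds from pup introduction


def score_trial(bouts: List[Bout]) -> Dict[str, Optional[float]]:
    """Summarise one trial following the scoring rules described in the text."""
    # Parental duration: sum of crouching, nesting and licking durations.
    parental_duration = sum(
        b.end_s - b.start_s for b in bouts if b.behaviour in PARENTAL_BEHAVIOURS
    )
    # Each successful retrieval and each pup-directed attack contributes 1 point.
    retrieval_onsets = [b.start_s for b in bouts if b.behaviour == "retrieval"]
    attack_score = sum(1 for b in bouts if b.behaviour == "attack")
    # Assumption: latency is the onset of the first retrieval; None if no retrieval.
    latency = min(retrieval_onsets) if retrieval_onsets else None
    return {
        "parental_duration_s": parental_duration,
        "retrieval_score": len(retrieval_onsets),
        "attack_score": attack_score,
        "retrieval_latency_s": latency,
    }
```

Returning `None` for the latency of an animal that never retrieves keeps non-retrievers distinguishable from fast retrievers; how such animals were handled in the original analysis is not stated here.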
Maternal aggression. Postpartum females were tested for maternal aggression
8 days post-delivery. All pups were removed from the cage and an adult intruder male
was immediately introduced into the female’s cage for a 10-min assay. Females were scored
for the duration and number of their aggressive attacks towards the intruder male.
Female sexual behaviour. The oestrous cycle stage was determined for 4 consecutive
days (see below). Only females confirmed to be in oestrus were introduced to
an experienced male for a 15-min assay. Female receptivity was scored as lordosis,
successful male mountings and male rejections.

Male-male interactions. Males were separated to their home cages and introduced
to unfamiliar 6- to 7-week-old C57BL/6 male intruders for 10 min. The male
intruders were swabbed with 60 µl of urine collected from sexually mature and
experienced male mice. All experiments were videotaped and the behaviours of
the resident animals were scored for the following parameters: number of attacks,
latency to attack and total duration of aggression (Supplementary Videos 3, 4).
Open-field assay. The test was performed in a white Plexiglas box (50 × 50 × 40
cm) with an overhead lamp directed to the centre of the field, providing 120 lx of
illumination on the floor. Each mouse was placed in the corner of the apparatus
and its locomotion pattern was recorded for 10 min.

Elevated plus maze assay. The test was performed using a polyvinyl chloride maze
comprising a central part (5 × 5 cm), two opposing open arms (30.5 × 5 cm), and
two opposing closed arms (30.5 × 5 × 15 cm). The apparatus was raised to a
height of 53.5 cm, and the open arms were provided with 6 lx of illumination.
Each mouse was placed in the centre facing an open arm and its locomotion was
recorded for 5 min. The distance travelled and time spent in the open arms
were measured.

ChR2-mediated in vivo photostimulation. Prior to all behavioural tests, animals
were connected to optical fibres with a 200-µm silica core (BFL37-200; Thorlabs)
for 5-10 min habituation. To enable free movement during the test, we connected
the optical fibres to an optical rotary joint (Doric Lenses, QC, Canada). Light
stimulation at 473 nm was provided by a DPSS laser (CrystaLaser) and lasted
for the entire duration of the assay. Photostimulation of TH⁺ AVPV neurons
was most effective at low frequencies (<10 Hz) and its efficacy declined at higher
frequencies (Extended Data Fig. 5b). Photostimulation parameters in all assays
were 1 Hz frequency, 10 ms pulse-width and 95 mW mm⁻² light power density.
Laser pulses were driven by a 33220A Function Waveform Generator (Agilent
Technologies, Israel). For c-Fos verification, mice were stimulated with a train
of 473 nm light (5 Hz, 10 ms) for 30 min. Mice were euthanized 60 min
post-stimulation and brains were processed for immunohistochemistry. The activation
frequency of 5 Hz was chosen as we observed efficient excitation of the neurons
using this frequency in the acute slice preparation.
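
To make the stimulation parameters concrete, the sketch below works out the pulse count, duty cycle and time-averaged power density of a pulse train from its frequency, pulse width and duration. It is a minimal illustration and not part of the original methods; the function name and the example durations passed to it are assumptions.

```python
from typing import Dict, Optional


def pulse_train_summary(
    freq_hz: float,
    pulse_ms: float,
    duration_min: float,
    peak_mw_per_mm2: Optional[float] = None,
) -> Dict[str, Optional[float]]:
    """Summarise a regular train of light pulses (illustrative helper)."""
    # Number of pulses delivered over the whole stimulation period.
    n_pulses = round(freq_hz * duration_min * 60.0)
    # Fraction of time the light is on.
    duty_cycle = freq_hz * pulse_ms / 1000.0
    # Time-averaged power density, if a peak power density is supplied.
    mean_power = None if peak_mw_per_mm2 is None else peak_mw_per_mm2 * duty_cycle
    return {
        "n_pulses": n_pulses,
        "duty_cycle": duty_cycle,
        "mean_mw_per_mm2": mean_power,
    }


# Behavioural setting from the text: 1 Hz, 10 ms, 95 mW mm^-2 (15-min assay assumed).
behaviour = pulse_train_summary(1.0, 10.0, 15.0, peak_mw_per_mm2=95.0)
# c-Fos verification setting from the text: 5 Hz, 10 ms for 30 min.
c_fos = pulse_train_summary(5.0, 10.0, 30.0)
```

At 1 Hz with 10-ms pulses the light is on for 1% of the time, so a 15-min assay contains 900 pulses; the 5 Hz, 30-min c-Fos protocol contains 9,000 pulses at a 5% duty cycle.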

Slice electrophysiology recording. Four to eight weeks after virus injection,
300-µm-thick coronal sections of the AVPV and PVN were prepared using a
vibratome (Leica VT1200S) in ice-cold sucrose cutting solution (in mM: 11
D-glucose, 234 sucrose, 2.5 KCl, 1.25 NaH₂PO₄, 10 MgSO₄, 0.5 CaCl₂, 26
NaHCO₃) oxygenated with 95% O₂/5% CO₂. Slices were then incubated at
32 °C for 30 min in high-osmolarity artificial cerebrospinal fluid (aCSF; in mM:
3.24 KCl, 11.88 glucose, 132.8 NaCl, 28.1 NaHCO₃, 1.35 NaH₂PO₄, 1.08 MgCl₂,
2.16 CaCl₂; 320 mOsm kg⁻¹, aerated with 95% O₂/5% CO₂) and another 30 min in
iso-osmotic aCSF (in mM: 3 KCl, 11 glucose, 123 NaCl, 26 NaHCO₃, 1.25
NaH₂PO₄, 1 MgCl₂, 2 CaCl₂; 300 mOsm kg⁻¹, aerated with 95% O₂/5% CO₂) at
32 °C. Following recovery, slices were kept at room temperature until use. The
recording chamber was perfused with oxygenated aCSF at a rate of 1.5-2 ml min⁻¹
and maintained at 32 °C. Borosilicate glass pipettes (Sutter Instrument
BF100-58-10) with resistances ranging from 4-6 MΩ were pulled using a laser micropipette
puller (Sutter Instrument Model P-2000) and filled with intracellular solution (in
mM: 135 K-gluconate, 4 KCl, 2 NaCl, 10 HEPES, 4 EGTA, 4 MgATP, 0.3 NaTRIS,
280 mOsm kg⁻¹, pH adjusted to 7.3 with KOH). Recordings in PVN OT cells were
performed with an intracellular solution (in mM: 120 Cs-gluconate, 11 CsCl, 1
MgCl₂, 1 CaCl₂, 10 HEPES, 11 EGTA, 5 QX-314, 280 mOsm kg⁻¹, pH adjusted to
7.3 with CsOH). Neurons were patched under visual guidance using infrared
differential interference contrast (DIC) microscopy (Olympus BX51WIF) and
an Andor Clara CCD camera. OT cells were identified based on Venus fluorescence.
Recordings were carried out using a Multiclamp 700B amplifier (Axon
Instruments). Optical activation of ChR2-expressing neurons was performed using
475/28 nm and 10 ms light pulses at 19 mW mm⁻² (Lumencor Spectra-X) delivered
through the microscope illumination path.
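
As a consistency check on the two aCSF recipes, the listed concentrations of the high-osmolarity solution appear to be those of the iso-osmotic solution scaled by a common factor of about 1.08. The short Python sketch below computes the per-component ratios; it is an illustration only, and the dictionary names and the reading of the salt formulae from the printed recipe are assumptions.

```python
# Concentrations in mM, as listed in the text for the two aCSF recipes.
ISO_ACSF_MM = {
    "KCl": 3.0, "glucose": 11.0, "NaCl": 123.0, "NaHCO3": 26.0,
    "NaH2PO4": 1.25, "MgCl2": 1.0, "CaCl2": 2.0,
}
HIGH_ACSF_MM = {
    "KCl": 3.24, "glucose": 11.88, "NaCl": 132.8, "NaHCO3": 28.1,
    "NaH2PO4": 1.35, "MgCl2": 1.08, "CaCl2": 2.16,
}


def scaling_ratios(high: dict, iso: dict) -> dict:
    """Ratio of each component's concentration in `high` to that in `iso`."""
    return {name: high[name] / iso[name] for name in iso}


ratios = scaling_ratios(HIGH_ACSF_MM, ISO_ACSF_MM)
for name, ratio in ratios.items():
    # Every component comes out close to 1.08 (differences reflect rounding).
    print(f"{name}: {ratio:.3f}")
```

A uniform ratio across components suggests that the high-osmolarity aCSF can be prepared by concentrating the iso-osmotic recipe rather than by adjusting a single solute, although the text does not say how the solution was actually made up.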

Oestrous cycle analysis. Oestrous cycle stage of females was determined for 9
consecutive days. Each day at 10 am a vaginal smear was collected from all females,
stained with Dip Quick stain kit (Jorgensen Laboratories, Inc.) and analysed for
oestrus under a light microscope. The stage of the oestrous cycle was determined as
previously described.

Hormone level analysis. Upon completion of behavioural assays, mice were 
anaesthetized with isoflurane and blood samples were collected from the orbital 
sinus. The blood was centrifuged and the supernatant was collected and stored at
−80 °C. Plasma oxytocin, corticosterone, oestradiol and prolactin levels were
measured using ELISA kits according to the manufacturers’ protocols (Enzo Life
Sciences - ADI-900-153 for oxytocin, Cayman Chemical Company - 500655 for
corticosterone and 582251 for oestradiol, Abcam - ab100736 for prolactin).
Immunohistochemistry. Following behavioural assays and blood collection the
mice were euthanized, perfused with 4% paraformaldehyde (PFA), and their
brains were sectioned with a vibratome (Leica Microsystems) into 30-50 µm
coronal slices. Floating brain slices were collected, washed three times in PBS,
and immunostained for tyrosine hydroxylase (TH), DOPA decarboxylase
(DDC), GFP, oxytocin (OT) or c-Fos using the following protocol. For 24-48 h
at 4 °C, slices were incubated in blocking solution: 10% to 20% normal human
serum (NHS), 0.04% Triton; carrier: 1% NHS, 0.03% Triton; primary antibodies:
rabbit/sheep anti-TH (1:1,000, Millipore), rabbit anti-DDC (1:1,000, Novus), goat
anti-GFP biotinylated (1:200-1,000, Abcam), rabbit anti-c-Fos (1:1,000, Santa
Cruz) and guinea-pig/rabbit anti-OT (1:1,000, Peninsula Laboratories LLC) for
24-48 h at 4 °C; secondary antibodies: Cy3-goat anti-rabbit, Cy5-donkey
anti-guinea pig (1:200, Jackson ImmunoResearch), Alexa Fluor 488-conjugated
streptavidin, Alexa Fluor 488/594 anti-rabbit, Alexa Fluor 488/594 anti-sheep (1:200,
Molecular Probes).

Image analysis and cell counting. All brain slices were imaged by epifluorescence
microscopy (Nikon, Eclipse 80i) or by confocal microscopy (Zeiss, LSM 710) for
subsequent analysis. Brain areas were determined according to their anatomy
using Franklin and Paxinos Brain Atlas. For AVPV TH⁺ cell counts the entire
AVPV was sliced, stained and counted. For c-Fos, anterograde tracing, viral
infection and 6-OHDA lesion specificity experiments, cell counts were performed on
selected brain slices, chosen in each animal according to standard anatomical
markers. All counts were done manually by an experimenter blind to test conditions.
Anterograde tracing. Mice injected unilaterally with AAV-DIO-EYFP virus were
perfused 5 weeks post-injection, and their brains were sectioned coronally (50 µm
slices) and mounted on slides. Each brain area in which EYFP-immunoreactive
fibres were detected was scanned by confocal microscope under uniform imaging
settings (LSM 710, Zeiss, Germany). Representative brain images were analysed
for projection intensity using ImageJ software. For trans-synaptic anterograde
tracing, females injected unilaterally with H129ΔTK-TT were perfused 5 to 7 days
post-injection as previously described. The brains were sectioned into 50 µm
coronal slices, and these were mounted on slides and analysed to verify the
injection site. Brain slices from the PVN and the SON were immunostained with
anti-oxytocin antibody and imaged by confocal microscopy for further analysis of
colocalization with tdTomato-labelled virally transduced neurons.

Statistical analysis. Sample size was determined according to the accepted practice
for behavioural assays, but no statistical methods were used to predetermine
sample size. All data are expressed as means ± s.e.m. Behavioural assays were
analysed using the non-parametric Mann-Whitney U-test. Two-way ANOVA
with post-hoc tests or two-tailed Student’s t-tests were used for statistical
evaluation of immunostaining assays, tracing experiments and hormone analyses.
Outlier samples (more than 2 s.d. from the mean in >2 parameters) were excluded
from the experiment. All statistical tests were performed using STATISTICA software
(StatSoft, Tulsa, OK).
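
The summary statistics and exclusion rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the STATISTICA procedure used in the study: the function names are hypothetical, the sample standard deviation (n − 1 denominator) is assumed, and the mean and s.d. used for the outlier rule are assumed to be computed with the candidate sample included. The Mann-Whitney function returns only the U statistics; a P value would require a statistics library.

```python
import statistics
from typing import Dict, List, Sequence, Tuple


def mean_sem(values: Sequence[float]) -> Tuple[float, float]:
    """Mean and standard error of the mean (sample s.d. / sqrt(n))."""
    return statistics.mean(values), statistics.stdev(values) / len(values) ** 0.5


def mann_whitney_u(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """U statistics for groups a and b; ties count as half a pair."""
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    return u_a, len(a) * len(b) - u_a


def outlier_animals(table: Dict[str, List[float]], n_sd: float = 2.0,
                    min_params: int = 3) -> List[int]:
    """Indices of animals lying more than n_sd s.d. from the group mean
    in at least min_params parameters (">2 parameters" means 3 or more).
    table maps each parameter name to one value per animal, in a fixed order."""
    n_animals = len(next(iter(table.values())))
    counts = [0] * n_animals
    for values in table.values():
        mean, sd = statistics.mean(values), statistics.stdev(values)
        for i, value in enumerate(values):
            if sd > 0 and abs(value - mean) > n_sd * sd:
                counts[i] += 1
    return [i for i, count in enumerate(counts) if count >= min_params]
```

Requiring deviation in more than two parameters, rather than in any single one, makes the rule conservative: an animal with one extreme latency is retained unless it is also extreme on other measures.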
Extended Data Figure 1 | 6-OHDA injection into AVPV of male and female
mice results in specific ablation of TH⁺ neurons. a, TH immunostaining in
AVPVs of females and males injected with 6-OHDA (TH-ablation) or saline
(control). Scale bars, 20 µm. b, Number of TH⁺ AVPV neurons in TH-ablation
and control females and males (females, n_TH-ablation = 12, n_control = 13; males,
n_ablation = 9, n_control = 8; ***P < 0.001, two-way ANOVA with Fisher’s
multiple comparisons). c, TH immunostaining in brain slices from females
injected with 6-OHDA or saline into the AVPV. Scale bars, 20 µm.
d, Number of TH-immunoreactive neurons in TH-expressing brain areas
(n_ablation = 5, n_control = 5). Data are means ± s.e.m. AVPV, anteroventral
periventricular nucleus; ARC, arcuate nucleus; SN, substantia nigra; VTA,
ventral tegmental area.
Extended Data Figure 2 | TH⁺ ablation in AVPV impairs maternal
behaviour and increases inter-male aggression. a-d, Maternal behaviour of
virgin females in TH-ablation and control groups (n_ablation = 13, n_control = 13).
e, Pup retrieval of postpartum females in TH-ablation and control groups
(n_ablation = 11, n_control = 11). f-h, Pup-directed behaviours of virgin (g, h) and
parental (f) males in TH-ablation and control groups. i, j, Inter-male aggression
in TH-ablated and control groups (n_ablation = 10, n_control = 10). Data are
means ± s.e.m. *P < 0.05; **P < 0.01, Mann-Whitney U-test.




[Extended Data Figure 3 plots. Panel groups: female sexual behaviour (a-d); oestrous cycle (e; cycle stage by day, days 1-9; P = proestrous, E = oestrous, D = diestrous, M = metestrous); female reproductive success (f-h). Groups: control and TH-ablation.]


Extended Data Figure 3 | TH⁺ ablation in female AVPV does not affect
sexual behaviour and reproduction. a-d, Female sexual behaviour in
TH-ablation (6-OHDA) and control (saline) groups (n_ablation = 7, n_control = 7).
a, Percentage of females displaying lordosis behaviour. b, Total duration
of lordosis behaviour. c, Number of defensive rejections of the intruder male by
the subject females. d, Successful sexual mounting events of the intruder
male on the subject females. e, Oestrous cycles in TH-ablation and control
females (n_ablation = 8, n_control = 8). f-h, Female reproductive success in
TH-ablation and control groups. f, Gestational success in percentage after
copulation with males (n_ablation = 14, n_control = 14). Litter size (g) and pup
weight (h) of TH-ablation and control females (n_ablation = 12, n_control = 12).
Data are means ± s.e.m.
Extended Data Figure 4 | TH overexpression in TH+ AVPV neurons
increases maternal pup retrieval. a-d, Maternal behaviour in TH-overexpression
(TH-OE) and control virgin females (n_OE = 9, n_control = 10).
e, Pup retrieval in TH-OE and control postpartum females (n_OE = 8,
n_control = 7). f-h, Pup-directed behaviours in TH-OE and control virgin
(g, h) and paternal (f) males. i, j, Inter-male aggression in TH-OE and control
males (n_OE = 10, n_control = 10). Data are mean ± s.e.m. *P < 0.05; **P < 0.01,
Mann-Whitney U-test.
Extended Data Figure 5 | Intrinsic electrophysiological properties of TH+
AVPV neurons. Whole-cell recordings were performed in acute coronal slices
from TH-Cre mice co-injected with DIO-EYFP and DIO-ChR2(E123T/T159C)-mCherry
viral vectors. Cells were identified based on EYFP expression
and recorded in current-clamp mode. Cells were then classified as ChR2+ or
ChR2− based on the presence or absence of a direct, short-latency (<1 ms)
light-evoked photocurrent response. a, Differential interference contrast (top)
and mCherry fluorescence (bottom) images of a TH+ AVPV cell expressing
ChR2-mCherry. Scale bar, 20 µm. b, Light-evoked spiking fidelity in TH+
AVPV neurons across varying light pulse frequencies (ChR2+, n = 12 cells;
ChR2−, n = 10 cells). Light pulse trains containing 20 pulses (10 ms,
19 mW mm⁻², 475 nm) at each frequency were used to calculate response
rates. Only spikes that occurred within 10 ms of light onset were calculated
as direct responses. Apparent responses in ChR2− cells are attributed to the
ongoing spontaneous firing of these neurons. c, Current clamp recording
of voltage responses to negative (100 pA, red) and positive (50 pA, black)
current injections in an AVPV TH+ neuron. d-i, Intrinsic electrical properties
of TH+/ChR2+ (blue bars) and TH+/ChR2− (white bars) cells, calculated from
responses to current injections as shown in c: input resistance (d), spontaneous
action potential firing rate (e), width of action potentials at half-maximum
(f), resting membrane potential (g), action potential threshold (h) and
membrane time constant (i). All showed no marked difference between ChR2−
and ChR2+ cells (ChR2+, n = 8; ChR2−, n = 4). j, Action potential firing rates
of TH+/ChR2+ cells recorded in whole-cell patch clamp mode before, during
and after 1 Hz optogenetic stimulation (data are means ± s.e.m., ***P < 0.05,
paired t-test, n = 7 cells).
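The response-rate rule stated for panel b (a pulse counts as a direct response only if a spike occurs within 10 ms of light onset, over trains of 20 pulses) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' analysis code: the function names, the use of millisecond timestamps and the example spike times are all hypothetical.

```python
from bisect import bisect_left


def pulse_train(frequency_hz, n_pulses=20, start_ms=0.0):
    """Onset times (ms) of a regular train of light pulses."""
    period_ms = 1000.0 / frequency_hz
    return [start_ms + k * period_ms for k in range(n_pulses)]


def spike_fidelity(pulse_onsets_ms, spike_times_ms, window_ms=10.0):
    """Fraction of pulses followed by at least one spike within window_ms."""
    spikes = sorted(spike_times_ms)
    hits = 0
    for onset in pulse_onsets_ms:
        i = bisect_left(spikes, onset)  # first spike at or after this onset
        if i < len(spikes) and spikes[i] - onset <= window_ms:
            hits += 1
    return hits / len(pulse_onsets_ms)


# Hypothetical recording: a 10 Hz train in which the first 15 pulses each
# evoke a spike 4 ms after onset, plus one late spike outside any window.
onsets = pulse_train(10)
spikes = [onset + 4.0 for onset in onsets[:15]] + [1555.0]
fidelity = spike_fidelity(onsets, spikes)
print(f"fidelity at 10 Hz: {fidelity:.2f}")
```

Spontaneous spikes that happen to fall inside a window are counted as hits by this rule, which is consistent with the caption's note that apparent responses in ChR2− cells reflect ongoing spontaneous firing.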
Extended Data Figure 6 | Optogenetic activation of TH+ AVPV neurons
increases maternal behaviour and reduces inter-male aggression.
a-d, Maternal behaviour of virgin females during optogenetic activation in TH+
AVPV neurons (n_ChR2 = 12, n_control = 14). e-g, Pup retrieval and maternal
aggression of postpartum females during optogenetic activation in TH+ AVPV
neurons (n_ChR2 = 6, n_control = 6). h-j, Pup-directed behaviours of virgin (i, j) and
paternal (h) males through optogenetic activation in TH+ AVPV neurons.
k, l, Inter-male aggression through optogenetic activation in TH+ AVPV neurons
(n_ChR2 = 10, n_control = 10). Data are means ± s.e.m. #P = 0.09 (a) or 0.05
(e), *P < 0.05; **P < 0.01, ***P < 0.001, Mann-Whitney U-test.
Extended Data Figure 7 | TH+ AVPV neuronal manipulations do not affect
locomotion or anxiety in females. a-f, Open field assay. Total distance
travelled (left) and total visits to centre of the field (right) for TH-ablation (top),
TH-OE (middle) and TH-ChR2 (bottom) females relative to respective
control groups. g-l, Elevated plus maze assay. Total time spent in the open arms
of the maze (left) and total visits to open arms (right) for TH-ablation (top),
TH-OE (middle) and TH-ChR2 (bottom) females relative to respective
control groups (TH-ablation, n_ablation = 13, n_control = 13; TH-OE, n_OE = 10,
n_control = 10; TH-ChR2, n_ChR2 = 7, n_control = 9). TH-ChR2 mice and
EYFP-expressing controls were tested using attached fibre optics and blue
light stimulation as described for maternal behaviour testing. Data are
means ± s.e.m.
Extended Data Figure 8 | TH+ AVPV neuronal manipulations do not
affect locomotion or anxiety in males. a-d, Open field assay. Total distance
travelled (left) and total visits to centre of the field (right) for TH-ablation (top)
and TH-OE (bottom) males relative to control groups. e-j, Elevated plus
maze assay. Total time spent in open arms of the maze (left) and total visits to
open arms (right) for TH-ablation (top), TH-OE (middle) and ChR2 (bottom)
males relative to respective control groups (TH-ablation, n_ablation = 12,
n_control = 12; TH-OE, n_OE = 10, n_control = 10; TH-ChR2, n_ChR2 = 4,
n_control = 7). TH-ChR2 mice and EYFP-expressing controls were tested
with attached fibre optics and blue light stimulation as described for parental
behaviour testing. Data are means ± s.e.m.
Extended Data Figure 9 | Hormone levels in plasma of TH+ AVPV
manipulated mice. a, OT levels in males (TH-ablation, n_ablation = 9,
n_control = 9; TH-OE, n_OE = 6, n_control = 5; TH-ChR2, n_ChR2 = 7, n_control = 8).
b, Oestradiol levels in females (TH-ablation, n_ablation = 9, n_control = 11;
TH-OE, n_OE = 6, n_control = 5; TH-ChR2, n_ChR2 = 10, n_control = 10).
c, d, Corticosterone levels in females (c) and males (d) (females: TH-ablation,
n_ablation = 6, n_control = 5; TH-OE, n_OE = 9, n_control = 9; TH-ChR2, n_ChR2 = 9,
n_control = 8; males: TH-ablation, n_ablation = 4, n_control = 4; TH-OE, n_OE = 8,
n_control = 8; TH-ChR2, n_ChR2 = 7, n_control = 7). e, Prolactin levels in females
(TH-ablation, n_ablation = 11, n_control = 10; TH-OE, n_OE = 5, n_control = 5;
TH-ChR2, n_ChR2 = 6, n_control = 6). Data are normalized to matched control groups.
No significant differences were found between the TH-manipulated and
control groups in the presented parameters. Data are means ± s.e.m.
Extended Data Figure 10 | TH+ AVPV neuronal projection and suggested
model by which TH+ AVPV neurons promote maternal care. a, Coronal
brain sections of mice unilaterally injected with a conditional EYFP-expressing
viral vector (AAV-DIO-EYFP) into the AVPV of TH-Cre female mice.
Projections from TH+ AVPV neurons into various brain structures are
presented. Scale bars, 1 mm (left panel) and 100 µm (right panel).
b, Fluorescent intensities of EYFP-labelled projection fibres of TH+ AVPV
neurons in different brain structures of TH-Cre females and males injected
with a conditional EYFP-expressing viral vector (data are means ± s.e.m.,
females, n = 3; males, n = 5, *P < 0.05, **P < 0.01, ***P < 0.001, two-tailed
Student’s t-test). AVPV, anteroventral periventricular nucleus; POA, preoptic
area; LS, lateral septum; BNST, bed nucleus of the stria terminalis; MPOA,
medial preoptic area; SON, supraoptic nucleus; PVN, paraventricular nucleus;
DM, dorsomedial nucleus; LHA, lateral hypothalamic area; PAG,
periaqueductal grey; ARC, arcuate nucleus. c, Schematic illustration of
projections from TH+ AVPV neurons of adult females, in a transverse view.
Arrow thickness indicates projection density, measured as fluorescent intensity
of fibres labelled with EYFP. d, Specificity of Cre-dependent EYFP
expression. Image showing a coronal section from a TH-Cre mouse injected
with AAV-DIO-EYFP into the AVPV. Images show colocalization of EYFP in
TH+ neurons (green) and immunostaining for TH (red). Scale bars, 50 µm.
e, Suggested model by which TH+ AVPV neurons promote female-typical OT
release and maternal behaviour. (1) Pup-related sensory signals induce changes
in the activity of TH+ AVPV neurons; (2) activated TH+ AVPV neurons
stimulate OT+ PVN neurons; (3, 4) OT+ PVN neurons secrete OT into central
nervous system and blood; (5) maternal behaviour is facilitated.
Cell-fate determination by ubiquitin-dependent regulation of translation

Achim Werner, Shintaro Iwasaki, Colleen A. McGourty, Sofia Medina-Ruiz, Nia Teerikorpi, Indro Fedrigo,
Nicholas T. Ingolia & Michael Rape


Metazoan development depends on the accurate execution of dif- 
ferentiation programs that allow pluripotent stem cells to adopt 
specific fates’. Differentiation requires changes to chromatin archi- 
tecture and transcriptional networks, yet whether other regulatory 
events support cell-fate determination is less well understood. Here 
we identify the ubiquitin ligase CUL3 in complex with its vertebrate-specific
substrate adaptor KBTBD8 (CUL3^KBTBD8) as an
essential regulator of human and Xenopus tropicalis neural crest
specification. CUL3^KBTBD8 monoubiquitylates NOLC1 and its
paralogue TCOF1, the mutation of which underlies the neurocristopathy
Treacher Collins syndrome²,³. Ubiquitylation drives formation
of a TCOF1-NOLC1 platform that connects RNA polymerase I
with ribosome modification enzymes and remodels the trans- 
lational program of differentiating cells in favour of neural crest 
specification. We conclude that ubiquitin-dependent regulation of 
translation is an important feature of cell-fate determination. 
Cullin-RING ligases (CRLs), the largest class of ubiquitylation 
enzymes, have critical roles in metazoan development* '°. CRLs recog- 
nize their substrates through ~300 adaptor proteins, several of which 
are differentially expressed during development'’"’. Although muta- 
tions in CRL adaptors have been linked to human pathology’*”’, little 
is known about how distinct CRLs ensure robust differentiation into
specialized cell types. 

To discover CRLs with crucial roles in development, we employed 
genome-wide transcript analysis of differentiating human embryonic 
stem cells (hESCs). These experiments revealed a strong reduction 
in the abundance of the vertebrate-specific CUL3 adaptor KBTBD8 
during hESC differentiation (Extended Data Fig. 1a-c), which we
confirmed for KBTBD8 messenger RNA and protein by quantitative 
reverse transcription PCR (qRT-PCR) and western blot analysis 
(Extended Data Fig. 1d-g). Consistent with evolutionary conservation,
downregulation of KBTBD8 was observed in differentiating mouse 
ESCs (Extended Data Fig. 1h, i), as well as during Xenopus tropicalis 
development”. 

Depletion of KBTBD8 did not affect the cell cycle, survival, 
or pluripotency programs of hESCs (Extended Data Fig. 2a-e). 
Instead, gene expression profiles of hESCs subjected to embryoid body 
differentiation suggested that KBTBD8 was required for neural crest 
specification (Extended Data Fig. 2f and Supplementary Table 1). 
qRT-PCR experiments confirmed that loss of KBTBD8 reduced 
expression of neural crest markers, including FOXD3 and SOX10, 
which was accompanied by an increase in transcripts associated with 


Figure 1 | CUL3^KBTBD8 drives neural crest
specification. a, hESCs stably depleted of
KBTBD8 were subjected to neural conversion and
analysed by qRT-PCR (mean of 3 technical
replicates, ± s.e.m.). b, Depletion of KBTBD8
results in loss of neural crest cells, as determined by
western blot analysis (full scans available in
Supplementary Fig. 1). NC 9d, neural conversion,
day 9; molecular weight is given in kDa.
c, KBTBD8-depleted hESCs were subjected to
neural conversion and analysed by immunofluorescence
microscopy (mean of 3 biological
replicates, ± s.e.m.; ~1,500 cells per condition).
d, Xenopus tropicalis embryos injected with
translation-blocking morpholinos against
KBTBD8 were analysed by in situ hybridization.
e, Model of the CUL3^KBTBD8-controlled
developmental switch.
Figure 2 | CUL3^KBTBD8 monoubiquitylates
TCOF1 and NOLC1. a, High-confidence
interactors of wild-type (WT) or mutant KBTBD8.
Left: normalized total spectral counts (TSCs) per
interactor of wild-type KBTBD8 (sum of
3 biological replicates per condition). Right: heat
map depicting binding relative to wild-type
KBTBD8. b, Verification of KBTBD8 interactions
in 293T cells by anti-Flag immunoprecipitation
and western blot analysis. IP, immunoprecipitation;
molecular weight is given in kDa.
c, Immunoprecipitation of KBTBD8 from hESCs
(full scans available in Supplementary Fig. 1).
d, Ubiquitylated (Ubi) HA-tagged TCOF1 detected
after denaturing Ni-NTA purification in 293T
central nervous system (CNS) precursor and forebrain identity
(FOXG1, SIX3; Extended Data Fig. 2g). 

On the basis of these observations, we subjected hESCs to dual- 
SMAD inhibition (‘neural conversion’), which directs differentiation 
towards CNS precursor and neural crest cells'*. As seen during embry- 
oid body differentiation, depletion of KBTBD8 caused a striking loss of 
neural crest cells and an increase in CNS precursors (Fig. 1a, b), which 
was seen for multiple short hairpin RNAs (shRNAs) and was rescued 
by shRNA-resistant KBTBD8 (Fig. 3b and Extended Data Fig. 3g). We 
corroborated these results with single-cell resolution using the neural 
crest marker SOX10 (Fig. 1c) or AP2, p75 and HNK1, which are co- 
expressed in most neural crest cells (Extended Data Fig. 3a). KBTBD8 
was required for early neural crest specification, with CNS precursor 
markers accumulating in KBTBD8-depleted cells when neural crest 
markers were first detected in control experiments (Extended Data 
Fig. 3b-h). KBTBD8 was accordingly critical for differentiation of 
hESC-derived neural crest cells into glia, mesenchymal cells, melanocytes,
or chondrocytes (Extended Data Fig. 4a, b). Also in Xenopus
tropicalis, downregulation or inhibition of CUL3^KBTBD8 prevented
neural crest formation and caused an expansion of the CNS precursor
territory in the manipulated part of the embryo (Fig. 1d and Extended
Data Fig. 4c). Thus, CUL3^KBTBD8 regulates a developmental switch
that controls the generation of the neural crest, an embryonic cell
population that is found only in vertebrates (Fig. 1e).

To isolate essential targets of CUL3^KBTBD8, we used CompPASS
mass spectrometry to capture proteins that bound wild-type
KBTBD8 but not variants with a mutant substrate-binding domain
(KBTBD8(W579A); Extended Data Fig. 5a-d). These interaction networks
identified the paralogues NOLC1 and TCOF1 as predominant
interactors of KBTBD8, which were not recognized by KBTBD8(W579A)
(Fig. 2a). Using western blot analysis, we confirmed binding
of TCOF1 and NOLC1 to KBTBD8 but not KBTBD8(W579A)
(Fig. 2b), and showed that the same association occurred between
endogenous proteins in hESCs (Fig. 2c) and in reconstituted in vitro
systems (Extended Data Fig. 5e, f). Denaturing purification of ubiquitin
conjugates revealed that KBTBD8, but neither KBTBD8(W579A)
nor CUL3-binding-deficient KBTBD8(Y74A), induced the robust
monoubiquitylation of TCOF1 and NOLC1 (Fig. 2d-f). These events
required a cofactor, β-arrestin, the depletion of which prevented
KBTBD8 recognition and monoubiquitylation of TCOF1 and
NOLC1 (Extended Data Fig. 5g-j).
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e, Monoubiquitylation of HA-NOLCI1 by 
CUL3%®1®PS in 293T cells. £, Monoubiquitylation 
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cells reconstituted with KBTBD8 variants and 
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hESCs_ expressing only 


KBTBD8(W579A) or KBTBD8(Y74A) failed to support neural crest 
specification and showed increased abundance of CNS precursors 
Figure 3 | CUL3^KBTBD8
and NOLC1. a, hESCs were reconstituted with shRNA-resistant KBTBD8
variants or depleted of KBTBD8-binding partners, subjected to neural conversion
(9 d), and analysed by qRT-PCR and unsupervised clustering. b, Protein
expression during neural conversion of hESCs reconstituted with shRNA-resistant
KBTBD8 variants (full scans available in Supplementary Fig. 1).
Molecular weight is shown in kDa. c, Protein expression in hESCs stably depleted
of KBTBD8, TCOF1, or NOLC1 and subjected to neural conversion. d, hESCs
were stably depleted of the indicated combinations of KBTBD8, TCOF1, or
NOLC1, subjected to neural conversion (9 days), and analysed by qRT-PCR.




[Figure 4 image. Recoverable panel labels: fold change in TSCs (WT versus control); fold enrichment (>5-, >10- and >20-fold); TCOF1-NOLC1 complex; RNA polymerase I subunits (RPA1, RPA2, RPA43, RPAC1, RPABC1); H/ACA snoRNP; Box C/D snoRNP; SSU processome; schematic of rRNA transcription, modification and ribosome maturation; western blots of inputs and of anti-HA, anti-Flag, anti-RPA1 and anti-IgG immunoprecipitations.]



Figure 4 | Ubiquitylation-dependent TCOF1-NOLC1 complexes couple 
RNA polymerase I to ribosome modification enzymes. a, Interactors of 
TCOF1 in 293T cells reconstituted with KBTBD8 or KBTBD8(Y74A) (sum of 3 
biological replicates per condition). b, Validation of CUL3^KBTBD8-dependent
formation of TCOF1-NOLC1 complexes. c, CompPASS mass spectrometry 
analysis of sequential immunoprecipitation of Flag-TCOF1/HA-NOLC1 
complexes. d, Validation of sequential affinity purification of KBTBD8- 


(Fig. 3a, b and Extended Data Fig. 6a, b). The same aberrant differentiation
program was observed if we depleted TCOF1 or NOLC1 (Fig. 3a, c and
Extended Data Fig. 6a, c, d), but not other KBTBD8-binding partners
(Fig. 3a and Extended Data Fig. 6e, f). Demonstrating that these proteins
act in a common pathway, co-depletion of KBTBD8 and TCOF1
or NOLC1, respectively, mirrored the differentiation program of singly
depleted hESCs (Fig. 3d). We therefore conclude that TCOF1 and
NOLC1 are critical monoubiquitylation substrates of CUL3^KBTBD8
during neural crest specification. Consistent with this notion, mutations
in TCOF1 cause Treacher Collins syndrome, a craniofacial disorder
characterized by loss of cranial neural crest cells*”.

To understand how CUL3^KBTBD8 drives neural crest specification,
we identified proteins that selectively recognized ubiquitylated,
but not unmodified, TCOF1 using cells that were reconstituted with
either wild-type KBTBD8, inactive KBTBD8(Y74A), or empty vector.
Notably, NOLC1 emerged as the major effector that was recruited to
ubiquitylated TCOF1 (Fig. 4a), an observation that was confirmed by
affinity purification coupled to western blot analysis (Fig. 4b).
Monoubiquitylation often stabilizes binding partners, and depletion
of KBTBD8 caused degradation of both TCOF1 and NOLC1 at later
stages of neural conversion (Fig. 3c and Extended Data Fig. 7a, b).

On the basis of these results, we established a sequential affinity 
purification protocol to determine the composition of ubiquitylation-dependent
TCOF1-NOLC1 complexes. We found that TCOF1-NOLC1
assemblies engaged RNA polymerase I; the H/ACA complex catalysing 
rRNA pseudouridylation; and the SSU processome controlling 


dependent TCOF1-NOLC1 complexes (full scans available in Supplemen- 
tary Fig. 1). e, Immunoprecipitation of RNA polymerase I from 293T cells 
reconstituted with KBTBD8 variants. f, Immunoprecipitation of RNA 
polymerase I from hESCs depleted of KBTBD8. g, Model of ubiquitin- 
dependent formation of a TCOF1-NOLC1 platform. Molecular weight is 
given in kDa. 


maturation and modification of the small ribosomal subunit 
(Fig. 4c, d and Extended Data Fig. 7c). Accordingly, ubiquitylation 
by CUL3^KBTBD8 brought endogenous RNA polymerase I into complexes
with the SSU processome (Fig. 4e), which required TCOF1 and
NOLC1 (Extended Data Fig. 7d). Similar observations were made in
hESCs, where a robust interaction between RNA polymerase I and
SSU processome was lost upon depletion of KBTBD8 (Fig. 4f).
Thus, CUL3^KBTBD8 induces the ubiquitin-dependent formation of
TCOF1-NOLC1 complexes that serve as a platform to connect RNA 
polymerase I with enzymes responsible for ribosomal processing and 
modification (Fig. 4g). This observation supports a role of ubiquityla- 
tion in neural crest specification, as mutations in RNA polymerase I 
also cause Treacher Collins syndrome”. 

Although KBTBD8 targets proteins linked to ribosome biogenesis, 
its depletion did not affect the abundance of rRNAs or mRNAs 
encoding ribosomal proteins; levels of ribosomal proteins; processing 
of precursor rRNAs; nucleolar integrity; export of the small ribosomal 
subunit; ribosome binding to mRNA judged by polysome gradient 
analysis; global mRNA translation detected by metabolic labelling; or 
cell survival (Fig. 5a, d and Extended Data Fig. 8a-h). Accordingly, a 
global reduction in translation caused by rapamycin did not pheno- 
copy the loss of KBTBD8 (Extended Data Fig. 9a, b). Depletion of 
TCOF1 also did not affect rRNA synthesis, p53 activation, or cell 
survival at the time of neural crest specification (Extended Data Fig. 
9c-e), although consistent with previous work*”, it reduced rRNA 
levels and triggered cell death at late stages of neural conversion 
Figure 5 | CUL3^KBTBD8 remodels translational
(Extended Data Fig. 9f, g). These observations implied that KBTBD8
and TCOF1 initially specify neural crest fate without altering ribosome
abundance, global mRNA translation, or cell survival.

We therefore considered the possibility that CUL3^KBTBD8-dependent
assembly of a ribosome modification platform might produce ribosomes
with distinct translational output. Indeed, as seen by RNA
sequencing and ribosome profiling, depletion of KBTBD8 changed 
the translational program of cells undergoing neural conversion, 
whereas it had no effect on protein synthesis in hESCs (Fig. 5b and 
Extended Data Fig. 10a). Similar observations were made for TCOF1, 
and the translation efficiency profiles of differentiating hESCs lacking 
KBTBD8 or TCOF1 were correlated (Extended Data Fig. 10b). Loss of 
KBTBD8 caused changes in translation immediately after initiation of 
differentiation, and thus, before specification of hESCs into neural 
crest or CNS precursor cells (Fig. 5c). 
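Translation efficiency is commonly estimated as the ratio of ribosome footprint density (from ribosome profiling) to mRNA abundance (from RNA sequencing) for each gene, and profiles from two perturbations can then be compared by correlation. The sketch below illustrates that general idea only; it is not the authors' pipeline, and the function names, the pseudocount, the gene labels and all numbers are hypothetical. Inputs are assumed to be already normalized for library size.

```python
from math import log2, sqrt


def log2_te(footprints, mrna, pseudocount=1.0):
    """log2(footprint density / mRNA abundance) for genes present in both."""
    shared = footprints.keys() & mrna.keys()
    return {g: log2((footprints[g] + pseudocount) / (mrna[g] + pseudocount))
            for g in shared}


def delta_te(te_knockdown, te_control):
    """Change in log2 translation efficiency, knockdown minus control."""
    shared = te_knockdown.keys() & te_control.keys()
    return {g: te_knockdown[g] - te_control[g] for g in shared}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical normalized densities for three genes (illustration only).
mrna = {"geneA": 100.0, "geneB": 50.0, "geneC": 200.0}
fp_control = {"geneA": 100.0, "geneB": 50.0, "geneC": 200.0}
fp_kd1 = {"geneA": 200.0, "geneB": 25.0, "geneC": 200.0}
fp_kd2 = {"geneA": 180.0, "geneB": 30.0, "geneC": 210.0}

te_control = log2_te(fp_control, mrna)
d1 = delta_te(log2_te(fp_kd1, mrna), te_control)
d2 = delta_te(log2_te(fp_kd2, mrna), te_control)
genes = sorted(d1)
r = pearson([d1[g] for g in genes], [d2[g] for g in genes])
print(f"correlation of TE changes: r = {r:.2f}")
```

Here the same mRNA values are reused for every condition purely to keep the example short; in practice each condition has its own RNA-sequencing measurement.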

Analysis of regulated mRNAs showed that KBTBD8 suppressed the 
production of proteins specifying CNS precursors, whereas it did not 
affect translation of mRNAs connected to neural crest specification 
(Extended Data Fig. 10c). In this manner, KBTBD8 or TCOF1 delayed 
the accumulation of CNS precursor proteins, including ATRX and 
PCM1, until neural crest specification had occurred (Fig. 5d and 
Extended Data Fig. 10d). Underscoring the role of translational control, 
KBTBD8 enforced the correct timing of ATRX and PCM1 production 
without regulating their mRNA levels or protein stability (Fig. 5c, d and 
Extended Data Fig. 10d-f). The depletion of KBTBD8 also reduced the 
translation of mRNAs encoding histones or ribosomal components 
(Fig. 5b), yet as expected from their long half lives**”, the levels of 
the corresponding proteins were not diminished during our differentiation
experiment (Fig. 5d). Thus, the CUL3^KBTBD8-dependent formation
of a ribosome modification platform alters translation of specific
mRNAs, which delays accumulation of CNS precursor proteins until
hESCs have accomplished neural crest specification.

Our work documents an important role for ubiquitylation in 
remodelling translational programs during differentiation, and defines 
an early function for the CUL3^KBTBD8 substrate and Treacher Collins
syndrome-associated protein TCOF1 in neural crest specification (Fig.
5e). We hypothesize that CUL3^KBTBD8 and TCOF1 may govern the
production of differentially modified ribosomes, potentially including 
post-transcriptional changes in rRNA pseudouridylation and base 
methylation, or phosphorylation and ubiquitylation of ribosomal pro- 
teins or ribosome-associated factors. Such modifications may affect 
the interactions of ribosomes with select mRNAs, with factors that 
deliver specific mRNAs to the ribosome, or with proteins that control 
the synthesis or degradation of distinct mRNAs”. 

Together with studies implying specific functions for ribosomal 
proteins during differentiation**”’, developmental switches controlled 
by ribosomal regulation might explain why mutations in ribosome 
biogenesis factors result in tissue-specific ribosomopathies””?°?**°, 
Manipulating such switches could lead to therapeutic strategies for 
paediatric diseases: as Treacher Collins syndrome is caused by muta- 
tion of a single TCOF1 allele, increasing the efficiency of KBTBD8- 
dependent ubiquitylation of the remaining wild-type TCOF1 might 
reconstitute ribosomal regulation and neural crest formation. 

Online Content Methods, along with any additional Extended Data display items
and Source Data, are available in the online version of the paper; references unique
to these sections appear only in the online paper.
METHODS


No statistical methods were used to predetermine sample size. 

Plasmids, shRNAs, siRNAs, morpholinos. Full-length KBTBD8 was cloned into
pcDNA5 untagged, with a C-terminal 3×Flag tag, or a C-terminal HA tag, for
expression in human cells, into pMAL with an N-terminal MBP tag for expression
in Escherichia coli, and into pFastBac with an N-terminal 6×His tag for expression
in Sf9 ES insect cells using the Bac-to-Bac baculovirus expression system
(Invitrogen). For expression in human embryonic stem cells, KBTBD8 was cloned
into pENTR1A with a C-terminal Flag tag and recombined into pLenti-PGK-Hygro.
Point mutants in KBTBD8’s BTB-BACK domain (Y74A, M78D, L73D,
ST86,87AA, Y43A) and KBTBD8’s propeller domain (I352A, F450A, MF496,497AA,
F550A, and W579A) were introduced into these vectors by site-directed
mutagenesis and digestion of parental DNA with DpnI. Full-length coding
sequences of TCOF1 and NOLC1 were cloned into pCMV-3×Flag or pCS2-HA
vectors for expression in human cells and in vitro transcription/translation
(IVT/T), respectively, as described*’. His-ubiquitin was cloned into a pCS2 vector
for expression in cells*”. To generate a deubiquitylation-resistant ubiquitin variant,
a L73P mutation was introduced in this vector by site-directed mutagenesis as
described above. pLKO1-Puro Mission shRNA constructs targeting KBTBD8 (#1,
TRCN0000130280; #2, TRCN0000128536), NOLC1 (#1, TRCN000061971; #2,
TRCN0000298197), TCOF1 (#1, TRCN000008630), and BRD2 (#1, TRCN0000006310;
#2, TRCN0000381007), DHX15 (#1, TRCN0000000006; #2, TRCN0000425479),
PCNT (#1, TRCN0000162352; #2, TRCN0000162298) and ANKUB1 (#1,
TRCN0000336646) were purchased from Sigma. siRNAs were from Dharmacon
and sequences were as follows: scrambled control (UAGCGACUAAACACAUCAAUU),
ARRB1 (#1, CGGAGAAUUUGGAGGAGAAUU; #2, UCAUAGAACUUGACAAAUU),
ARRB2 (#1, ACAAGGAGGUGCUGGGAAUUU; #2, CUAAAUCACUAGAAGAGAAUU),
NOLC1 (#1, CCAAGAAGGCUGUGGAGAAUU; #2, CAGUUAAAGCUCAGAUAAUU;
#3, UCUCAGAGGUGGCCAAUAAUU), TCOF1 (#1, CCAUCAAGCAUGAAAGAAUU;
#2, GGAAAGGCCUCCAGGUGAAUU; #3, GGAAUCAGACAGUGAGGAAUU).

Four morpholino oligonucleotides (Gene Tools, LLC) were used: kbtbd8 translation
blocking, 5′-GTGCAGGAAACGTCACTTACTTCCT-3′; kbtbd8 splice
blocking, 5′-TCTCCCAGCCCCAAACAACC-3′; cul3 splice blocking,
5′-AAGTATCCTATGAGTCTCACCGGGA-3′; and a control tracer morpholino with
3′-fluorescein modification, 5′-CCTCTTACCTCAGTTACAATTTATA-3′.
Dominant-negative CUL3 (N-terminal 250 amino acids of human CUL3) was 
cloned into pCS2+ and mRNA was synthesized for injection using an in vitro 
transcription system. 
Proteins. KBTBD8, KBTBD8(Y74A) and KBTBD8(W579A) were purified from
ES Sf9 cells 72 h after transduction. Lysates were prepared in 50 mM sodium
phosphate, pH 8, 500 mM NaCl, and 10 mM imidazole by incubation with
200 µg ml⁻¹ lysozyme and sonication. Lysates were cleared by centrifugation and
incubated with Ni-NTA (QIAGEN) for 2 h at 4 °C. Beads were washed with lysis buffer
containing 0.1% Triton. Proteins were eluted in 200 mM imidazole in 50 mM
sodium phosphate, pH 8, 500 mM NaCl. TEV protease for His-tag removal was
added and proteins were dialysed overnight into 50 mM Tris pH 8.0,
100 mM NaCl, 2 mM DTT. Proteins were further purified by molecular sieving
over a preparative Superdex 200 column in 50 mM Tris pH 8.0, 100 mM NaCl,
2 mM DTT. KBTBD8 or KBTBD8 mutant fractions were concentrated, aliquoted,
flash-frozen in liquid nitrogen, and stored at −80 °C.

MBP-tagged KBTBD8, KBTBD8(Y74A), KBTBD8(F550A) and
KBTBD8(W579A) were expressed and purified from BL21/DE3 (RIL) cells.
Cells were grown in LB-medium to OD600 nm 0.5 followed by addition of
0.5 mM IPTG and protein production for 16 h at 18 °C. Cells were harvested,
lysed in 50 mM sodium phosphate, pH 8, 500 mM NaCl, by incubation with
200 µg ml⁻¹ lysozyme and sonication, and cleared by centrifugation. MBP-tagged
proteins were isolated using amylose resin (NEB), washed with lysis buffer, and
eluted in lysis buffer containing 10 mM maltose followed by dialysis into PBS
containing 2 mM DTT.

Antibodies. Mouse monoclonal anit-KBTBD8 antibodies were produced with 
Promab using MBP-KBTBD8-250-601 as antigen (1 gml~* in immunoblots 
(IB) and 10 pug mg | lysate in immunoprecipitations (IP)). Anti-PAX6 (DSHB, 
clone P3U1, 1:1,000 in IB), anti-PAX6 (Biolegend, PRB-278B, 1:300 in IF), anti- 
TDGF (#4193, clone D81B12, Cell Signaling, 1:1,000 in IB), anti- TFAP2 (#2509, 
Cell Signaling, 1:1,000 in IB, 1:100 in IF), anti-CDH1 (#3195, clone 24E10, Cell 
Signaling, 1:1,000 in IB), anti-FOXG1 (ab18259, Abcam, 1:250 in IB), anti-p75
(AB-N07, clone ME20.4, Advanced Targeting Systems, 1:100 in IF), anti-HNK1
(C6680, clone VC1.1, Sigma, 1:500 in IF), anti-SOX10 (ab155279, Abcam, 1:2,000
in IB), anti-SOX10 (ab27655, Abcam, 1:100 in IF), anti-GFAP (#3670, clone GA5,
Cell Signaling, 1:100 in IF), anti-SMA (A2547, clone 1A4, 1:100 in IF), anti-NF-L
(#2837, clone C28E10, Cell Signaling, 1:100), anti-Tuj1 (#5568, clone D71G9, Cell
Signaling, 1:100 in IF), anti-rRNA 5.8S (ab347144, clone Y10b, Abcam, 1:1,000 in
IF), anti-TCOF1 (11003-1-AP, Proteintech, 1:250 in IB), anti-TCOF1 (sc-49529,
Santa Cruz, 1:100 in IF), anti-FBL (ab5821, Abcam, 1:100 in IF), anti-NOLC1
(11815-1-P, Proteintech, 1:1,000 in IB), anti-ARRB1/2 (#4674, clone D24H9, Cell 
Signaling, 1:1,000 in IB), anti-PKN1 (610687, clone 49/PRK1, BD Bioscience, 
1:1,000 in IB), anti-CUL3 (Bethyl, 1:1,000 in IB), anti-NANOG (#3580, Cell 
Signaling, 1:1,000 in IB), anti-OCT4 (sc-8628, Santa Cruz, 1:1,000 in IB),
anti-SNAIL2 (#9585, clone C19G7, Cell Signaling, 1:500 in IB), anti-GAPDH (#2118,
clone 14C10, Cell Signaling, 1:10,000 in IB), anti-BRD2 (#5848, clone D89B4, Cell 
Signaling, 1:500 in IB), anti-PCNA (Santa Cruz, 1:5,000 in IB), anti-RPA1 (Santa 
Cruz, 1:100 in IB, 4 ug per 1 mg lysate in IP), anti-RPA2 (sc-17913, Santa Cruz, 
1:100 in IB), anti-RPB1 (sc-5943, Santa Cruz, 1:100 in IB), anti-DKC1 (sc-48794, 
Santa Cruz, 1:1,000 in IB), anti-NHP2 (sc-366967, Santa Cruz, 1:500 in IB),
anti-NOP56 (A302-720A, Bethyl, 1:1,000 in IB), anti-NOP58 (A302-719A, Bethyl,
1:1,000 in IB), anti-CSK2A (#2656, Cell Signaling, 1:500 in IB), anti-cleaved
caspase-3 (#9664, clone 5A1E, Cell Signaling, 1:100 in IB), anti-ATRX (A301-045A,
Bethyl, 1:500 in IB), anti-PCM1 (A301-149A, Bethyl, 1:500 in IB), anti-SNF2L 
(#12438, clone D4Q7V, Cell Signaling, 1:500 in IB), anti-HA (clone C29F4; Cell 
Signaling, 1:3,000 in IB), anti-Flag (F1804, clone M2, Sigma, 1:2,000 in IB), and 
anti-Flag (F7425, Sigma, 1:2,000 in IB) antibodies were purchased commercially.

Mammalian cell culture and transfections. Human embryonic kidney (HEK)
293T cells were maintained in DMEM with 10% fetal bovine serum. Plasmid 
transfections of HEK 293T cells were with calcium phosphate and siRNA trans- 
fections were with Lipofectamine RNAiMAX (Invitrogen) according to the man- 
ufacturer’s instructions using 10 nM for each siRNA. 

hES cell culture, lentiviral infections and hES differentiations. Human embry- 
onic stem (hES) H1 cells were obtained from the Wisconsin stem cell bank, 
routinely characterized for mycoplasma contamination, and maintained under 
feeder-free conditions on Matrigel-coated plates (#354277, BD Biosciences) in
mTeSR1 (#05871/05852, StemCell Technologies Inc.) and were routinely
passaged with collagenase (#07909, StemCell Technologies Inc.) and ReLeSR
(#05872, StemCell Technologies Inc.).

Lentiviruses were produced in 293T cells by co-transfection of lentiviral con- 
structs with packaging plasmids (Addgene) for 48-72 h. Transduction was carried 
out by infecting 30% confluent hES H1 cells with lentiviruses in the presence of 
6 µg ml⁻¹ Polybrene (Sigma). After 7 days of selection with appropriate antibiotic
(0.5 µg ml⁻¹ puromycin for pLKO1-puro-shRNA constructs, 500 µg ml⁻¹
hygromycin for pLKO1-hygro-shRNA or pLenti-Hygro constructs), hES H1 cells were
analysed and used in differentiation experiments. 

Embryoid body (EB) formation from hES H1 cells and hES H1 cells expressing 
control shRNA or shRNAs targeting KBTBD8 was performed using Aggrewell 
800 plates (#27865, StemCell Technologies Inc.) and APEL medium (#05210, 
StemCell Technologies Inc.) following the guidelines of the manufacturer’s tech- 
nical manual (#29146). In brief, single-cell suspensions were prepared by treat- 
ment of hES cells with accutase (#07920, StemCell Technologies Inc.) and 1 × 10⁶
cells were seeded per well of an Aggrewell 800 plate in APEL medium
supplemented with 10 µM Y-27632 ROCK inhibitor (Calbiochem). 24 h after seeding, EBs
were harvested and transferred into ultra-low adherence culture dishes (Corning) 
with < 1,000 EBs per well of a 6-well plate and differentiated in APEL medium for 
3 and 6 days. Medium was replaced every other day. 

Neural induction of hES H1 cells expressing different shRNA constructs was 
performed using STEMdiff Neural Induction Medium (#05831, StemCell 
Technologies Inc.) in combination with a monolayer culture method according 
to the manufacturer’s technical bulletin (#28044) and as previously described’*. In 
brief, single-cell suspensions were prepared by treatment of hES cells with accutase 
and 1.25–1.5 × 10⁶ cells were seeded per well of a 6-well plate in STEMdiff Neural
Induction Medium supplemented with 10 µM Y-27632 ROCK inhibitor. Neural
induction was performed for 1, 3, 6 and 9 days with daily medium change. For 
inhibition of general protein translation, hESCs were subjected to the neural 
conversion protocol in the presence of 50 or 100nM rapamycin. 

For long-term neural conversion experiments to assess spontaneous
differentiation of neural crest cells into derivatives, hESCs were subjected to
neural conversion using STEMdiff Neural Induction Medium as described above for
43 days. Medium was changed daily until 18 days, then every other day.

Microinjections and in situ hybridizations. Morpholinos (20 ng) and mRNA
(100 pg) with 2 ng of tracer morpholino were injected into the animal cap in 1 
blastomere of 2-cell stage Xenopus tropicalis embryos. At stage 10-14, embryos 
were sorted by left or right injection side via tracer morpholino fluorescence. 
Embryos were developed to stage 16-18 and fixed for 4-6 h in MEMFA at room 
temperature. In situ hybridization of X. tropicalis embryos with digoxygenin- 
labelled RNA probes was performed using a multi-basket method as described 
previously*’. Sorting and imaging were performed on a Zeiss SteREO Lumar.V12 
microscope. 

Cycloheximide (CHX) chase assays. For cycloheximide chase assays, control or
KBTBD8-depleted hES H1 cells and cells that had undergone neural induction for
3 days were treated with 40 µg ml⁻¹ CHX for 2, 4 and 6 h. Cells were lysed in 8 M
urea, 50 mM sodium phosphate, pH 8.0. Lysates were diluted in SDS loading
buffer, sonicated, and were analysed by immunoblotting. For quantification, 
immunoblot signals for respective proteins were quantified using ImageJ (NIH, 
http://rsbweb.nih.gov/ij/) and normalized to GAPDH or β-actin.
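
The normalization step described above can be sketched in a few lines of Python. This is an illustration only: it assumes that band intensities have already been exported from ImageJ, the function name and all numbers are hypothetical, and expressing each time point as a percentage of the 0 h lane is an assumption taken from the legend of Extended Data Fig. 10f.

```python
def normalize_chase(target, loading, reference_index=0):
    """Divide each target band by its loading control (e.g. GAPDH), then
    express every time point as a percentage of the reference (0 h) lane."""
    if len(target) != len(loading):
        raise ValueError("target and loading must have the same length")
    ratios = [t / l for t, l in zip(target, loading)]
    reference = ratios[reference_index]
    return [100.0 * r / reference for r in ratios]


# Hypothetical band intensities at 0, 2, 4 and 6 h of CHX treatment.
target_signal = [1000.0, 800.0, 500.0, 250.0]
gapdh_signal = [2000.0, 2000.0, 2000.0, 2000.0]
remaining = normalize_chase(target_signal, gapdh_signal)
```

Dividing by the loading control first corrects for unequal loading between lanes before the decay relative to the 0 h lane is computed.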

Gene expression analysis by microarray. To compare gene expression profiles of 
hES H1 cells versus embryoid bodies (6 days), gene expression profiles of control 
versus KBTBD8-depleted hES H1 cells, and gene expression profiles of control 
versus KBTBD8-depleted embryoid bodies (6 days), we isolated and purified total 
RNA from respective samples using the RNeasy Mini Kit (Qiagen, catalogue no. 
74104). Microarray analysis was performed in biological triplicates by the 
Functional Genomics Laboratory (UC Berkeley) using the Affymetrix HUMAN 
GENE 1.0 ST ARRAY. 

Quantitative real-time PCR (qRT-PCR) analysis. For qRT-PCR analysis, total 
RNA was extracted and purified from cells using the RNeasy Mini Kit (Qiagen, 
catalogue no. 74104) and transcribed into cDNA using the RevertAid first 
strand cDNA synthesis kit (#K1621, Thermo Scientific). Gene expression was 
quantified by Maxima SYBR Green/Rox qPCR (#K0221, Thermo Scientific) on
a StepOnePlus Real-Time PCR System (Applied Biosystems). Nonspecific signals
caused by primer dimers were excluded by dissociation curve analysis and use of
non-template controls. To normalize for loaded cDNA, β-actin or RPS6 was used
as endogenous control. Gene-specific primers for qRT-PCR were designed by 
using NCBI Primer-Blast or ordered pre-designed from Integrated DNA 
Technologies. Primer sequences can be found in Supplementary Table 2. 
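
As an illustration of normalizing to an endogenous control, the sketch below uses the widely used comparative Ct (2^-ΔΔCt) calculation. The text does not state which formula was applied, so this method, the function name and the Ct values are all assumptions, and the calculation presumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_control, ct_reference_control):
    """Fold change of a target gene versus a control sample (2^-ddCt).

    ct_reference is the endogenous control (for example beta-actin or RPS6)
    measured in the same cDNA sample as ct_target.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_ct_sample - delta_ct_control)


# Hypothetical Ct values: a marker gene and RPS6 in KBTBD8-depleted cells
# (first two arguments) and in control cells (last two arguments).
fold_change = relative_expression(26.0, 18.0, 24.0, 18.0)
```

In this made-up case the target crosses the threshold two cycles later than in the control at equal RPS6 signal, which corresponds to a fourfold lower mRNA level.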

Cluster analysis. To determine shRNA treatments that caused similar effects on
neural conversion of hESCs, we performed cluster analysis of mRNA expression 
profiles. We stably transduced H1 hESCs with lentiviruses expressing various 
shRNAs and/or shRNA-resistant cDNAs and subjected these cells to neural
conversion by dual SMAD inhibition for 9 days. We then measured the mRNA
abundance of neural progenitor and neural crest markers by RT-qPCR. Data sets
were clustered using the heatmap.2 function of the gplots package in R.
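
The clustering step was run in R. The Python sketch below only illustrates the idea behind it, using Euclidean distance and complete linkage, which are the defaults of heatmap.2. The marker values and treatment labels are hypothetical, and whether the defaults were used in the study is an assumption.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def complete_linkage(profiles):
    """Agglomerative clustering with complete linkage.

    Returns the merges in order as (members_a, members_b, distance).
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Complete linkage: distance between the farthest members.
                d = max(euclidean(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical log2 fold changes of (SOX10, SNAIL2, PAX6) per treatment.
profiles = {
    "shControl": [0.0, 0.0, 0.0],
    "shKBTBD8": [-3.0, -2.5, 1.0],
    "shTCOF1": [-2.8, -2.6, 1.1],
    "shBRD2": [0.3, -0.1, 0.1],
}
merges = complete_linkage(profiles)
```

Treatments that merge early have similar marker profiles. In this toy data set the two knockdowns that lose neural crest markers pair first.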

In vivo ubiquitylation assays. To detect ubiquitylation of ectopic TCOF1 or
NOLC1, HEK 293T cells were transiently transfected with 6×His-tagged
ubiquitin and HA-tagged TCOF1 or NOLC1. For detection of ubiquitylation of
endogenous TCOF1 or NOLC1, HEK 293T cells were transiently transfected with
6×His-tagged ubiquitin-L73P. Cells were harvested, washed with PBS, lysed
in 8 M urea, 50 mM sodium phosphate, pH 8.0 and sonicated. His–ubiquitin
conjugates were purified using Ni-NTA agarose (Qiagen) and ubiquitylated
NOLC1 or TCOF1 was detected by immunoblotting using anti-HA,
anti-TCOF1, or anti-NOLC1 antibodies. For analysis of the influence of β-arrestin
proteins on KBTBD8-mediated ubiquitylation, HEK 293T cells were transfected 
with control siRNAs or a pool of siRNAs targeting ARRB1 (#1 and #2) and ARRB2 
(#1 and #2) 24h before plasmid transfection. 

Immunoprecipitations for mass spectrometry. Anti-Flag immunoprecipita- 
tions (IPs) for mass spectrometry analysis were performed from extracts of 
HEK 293T cells transiently expressing KBTBD8-3×Flag versions or
3×Flag-TCOF1 in the presence and absence of KBTBD8 versions (20 × 15 cm dishes per
condition). Lysis was in two pellet volumes of 20 mM HEPES pH 7.3,
150 mM NaCl, 110 mM KOAc, 2 mM Mg(OAc)₂, 5 mM EDTA, 5 mM EGTA,
0.2% NP-40, and protease inhibitors (Roche) on ice. Lysates were sonicated,
cleared by centrifugation, passed through a 0.45 µm membrane filter, and
incubated with anti-Flag-M2 agarose (Sigma) for 2 h at 4 °C. After washing with lysis
buffer, Flag-tagged protein complexes were eluted with lysis buffer containing
0.5 mg ml⁻¹ 3×Flag peptide in three 15 min incubations at 30 °C, 800 rpm.
Eluates were either analysed by immunoblotting or further processed for
multidimensional protein identification technology (MudPIT) mass spectrometry.
Sequential NOLC1-TCOF1 IPs were performed from extracts of HEK 293T cells
transiently expressing 3×HA-NOLC1, 3×Flag-TCOF1, and untagged
KBTBD8 (120 × 15 cm plates for mass spectrometry). Lysates were prepared as
described above and incubated with anti-HA agarose (Sigma) for 2 h at 4 °C. After
washing with lysis buffer, bound proteins were eluted with lysis buffer containing
0.5 mg ml⁻¹ 3×HA peptide in three 15 min incubations at 30 °C, 800 rpm. HA
eluates were subjected to anti-Flag immunoprecipitation followed by Flag peptide 
elution (see above). 

Endogenous immunoprecipitations. Anti-KBTBD8 immunoprecipitations 
were performed from hES H1 cells (5 × 15 cm dishes per condition) and lysates
were prepared as described above. After incubation with KBTBD8 or control
antibodies (mIgGs, Santa Cruz) at 4 °C for 1 h, protein G beads (Roche) were
added for 2 h. After washing with lysis buffer, bound proteins were eluted with
2× SDS sample buffer and analysed by immunoblotting. Anti-RPA1 IPs were
performed in the same way but from either HEK 293T cells transiently expressing
KBTBD8-Flag, KBTBD8(W579A)-Flag, or KBTBD8(Y74A)-Flag (2 × 15 cm
plates for each condition) or from control or KBTBD8-depleted hES H1 cells
(5 × 15 cm plates for each condition).

Mass spectrometry and compPASS analysis. For mass spectrometry analysis, 
Flag immunoprecipitates were prepared as described above and precipitated with 
20% trichloroacetic acid (TCA, Fisher) overnight. Proteins were resolubilized and 
denatured in 8 M urea (Fisher), 100 mM Tris (pH 8.5), followed by reduction with 
5mM TCEP (Sigma), alkylation with 10 mM iodoacetamide (Sigma), and over- 
night digestion with trypsin (0.5 mg ml⁻¹, Fisher). Samples were analysed by the
Vincent J. Coates Proteomics/Mass Spectrometry Laboratory at UC Berkeley and 
compared with ~100 reference immunoprecipitations against different Flag- 
tagged bait proteins using a Java script programmed according to the 
CompPASS software suite**. For determination of the KBTBD8 interaction net- 
work, three independent KBTBD8-Flag IPs were compared as replicates against 
the reference IPs. Thresholds for high confidence interaction partners (HCIPs) 
were top 5% of interactors with highest Z-score and highest WD score. To 
narrow down putative substrates of KBTBD8 in the interaction map, we compared 
relative total spectral counts for each HCIP found in wild-type KBTBD8 immu- 
noprecipitates to the ones found in KBTBD8(Y74A), KBTBD8(F550A) and 
KBTBD8(W579A) immunoprecipitates. For identification of effector proteins 
recruited to TCOF1 upon ubiquitylation, we determined the TCOF1 interaction 
network as described above for KBTBD8 and compared relative total spectral 
counts for each TCOF1 HCIP to those found upon co-expression of KBTBD8 
or KBTBD8(Y74A). We then plotted relative TSC changes upon KBTBD8 
expression against the difference of relative TSC changes upon KBTBD8 and 
KBTBD8(Y74A) expression. 
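
A minimal Python sketch of the scoring idea follows. It computes a plain Z-score for one prey protein across the bait and reference immunoprecipitations and then keeps the top 5% of scored preys. It does not reproduce the CompPASS software or its WD score. Including the bait immunoprecipitation in the population, the use of the population standard deviation, and all counts and protein scores are assumptions made for illustration.

```python
import math
import statistics


def z_score(bait_count, reference_counts):
    """Z-score of a prey's spectral count in the bait IP relative to the
    distribution of its counts across all IPs (bait plus references)."""
    counts = list(reference_counts) + [bait_count]
    mean = statistics.mean(counts)
    spread = statistics.pstdev(counts)
    return 0.0 if spread == 0 else (bait_count - mean) / spread


def top_fraction(scores, fraction=0.05):
    """Names of the highest-scoring preys, keeping the given fraction."""
    keep = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:keep])


# Hypothetical prey: 40 spectral counts with the bait, absent from 4 references.
z_example = z_score(40, [0, 0, 0, 0])

# Hypothetical scores for 40 preys; the top 5% corresponds to two proteins.
scores = {"TCOF1": 9.5, "NOLC1": 8.7}
scores.update({f"background_{i}": 0.1 * i for i in range(38)})
hcips = top_fraction(scores)
```

A prey seen in most reference immunoprecipitations has a high mean and therefore a low Z-score, so common contaminants drop out of the top fraction.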

Immunofluorescence microscopy. For immunofluorescence analysis, hES H1 
cells or hES H1 cells expressing different shRNA were seeded on Matrigel-coated 
coverslips using accutase, fixed with 3.7% formaldehyde for 10 min, permeabilized 
with 0.1% Triton for 20 min, and stained with indicated antibodies or Hoechst. 
Images were taken using a Zeiss LSM 710 confocal microscope or an Olympus IX81
microscope, deconvolved using Metamorph, and processed using ImageJ.

Determination of average nucleolar size. To analyse nucleolar integrity, we
performed indirect immunofluorescence microscopy using antibodies against 
fibrillarin, an established nucleolar marker. We stained nucleoli of control and 
KBTBD8-depleted hESCs or of hESCs that were subjected to neural conversion for 
3 days. Images were taken for each condition using a Zeiss LSM 710 confocal 
microscope with a 20× objective followed by quantification of average nucleolar
and nuclear size using ImageJ. Average nucleolar size was expressed relative to 
average nuclear area. Error bars represent standard deviation of three different 
images (~100 cells per image). 
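
The calculation above can be sketched as follows, assuming that per-cell nucleolar and nuclear areas have been exported from ImageJ for each image. The areas are hypothetical and far fewer than the ~100 cells per image used in the study, and the choice of the sample (n − 1) standard deviation is an assumption.

```python
import statistics


def relative_nucleolar_size(nucleolar_areas, nuclear_areas):
    """Average nucleolar area expressed relative to average nuclear area."""
    return statistics.mean(nucleolar_areas) / statistics.mean(nuclear_areas)


# Hypothetical areas (arbitrary units) for three images of one condition:
# each entry is (nucleolar areas, nuclear areas).
images = [
    ([10.0, 14.0], [100.0, 100.0]),
    ([9.0, 11.0], [100.0, 100.0]),
    ([13.0, 15.0], [100.0, 100.0]),
]
per_image = [relative_nucleolar_size(n, a) for n, a in images]
mean_size = statistics.mean(per_image)
error_bar = statistics.stdev(per_image)  # sample s.d. across the three images
```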

Analysis of cell cycle progression. For DNA content analysis, control or 
KBTBD8-depleted hES H1 cells or control or KBTBD8-depleted cells that had 
undergone neural induction for 3 or 6 days were fixed in 70% ethanol in PBS 
overnight. Cells were pelleted and resuspended in PBS containing 1 mg ml⁻¹
RNase (Sigma) and 10 µg ml⁻¹ propidium iodide (PI), incubated at room
temperature for 30 min, then analysed using a Beckman-Coulter EPICS XL Flow
Cytometer (575 nm band pass filter). 

Cell proliferation assays. To compare the division rate of control and KBTBD8- 
depleted hES cells, we seeded 3 X 10° cells per well of a 6-well plate using accutase. 
Cells were accutased at 2, 3 and 4 days post seeding and counted using a haemo- 
cytometer. 

RNA sequencing and ribosome profiling. RNA-seq libraries were prepared with 
Tru-seq Ribo-zero gold kit (Illumina). The preparation of ribosome profiling 
library and the data analysis were performed according to the method previously 
described’. The libraries were sequenced on a HiSeq 2000 (Illumina). The reads 
were aligned to the hg19 human genome reference and the resulting aligned reads 
were mapped to UCSC known reference genes. Based on length of each footprint, 
specific A-site offsets were estimated as 14 for 26-28 nucleotides and 15 for 29-31 
nucleotides. For mRNA fragments, we used offset 14. For measuring footprint 
density and mRNA fragments between samples, we restricted our analysis to genes 
that have at least 128 summed counts in each sample, only including the genomic 
positions 15 codons following the start codon and the position 5 codons preceding 
the stop codon. DESeq”** was used to calculate fold change enrichment of genes by 
KBTBD8 or TCOF1 knockdown at each time point after neural induction. 
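
The read-assignment rules above can be expressed as a short Python sketch. The function names are hypothetical, the window boundaries are treated as inclusive (an assumption, since the text does not define them exactly), and the count filter is applied to a list holding one summed count per sample.

```python
def a_site_offset(read_length, footprint=True):
    """A-site offset in nucleotides for a read of the given length.

    Footprints of 26-28 nt use offset 14 and footprints of 29-31 nt use 15.
    mRNA fragments always use 14. Other footprint lengths return None.
    """
    if not footprint:
        return 14
    if 26 <= read_length <= 28:
        return 14
    if 29 <= read_length <= 31:
        return 15
    return None


def in_counting_window(codon_index, cds_length_codons):
    """True if a codon lies between 15 codons after the start codon and
    5 codons before the stop codon (start codon = index 0)."""
    stop_index = cds_length_codons - 1
    return 15 <= codon_index <= stop_index - 5


def passes_count_filter(counts_per_sample, minimum=128):
    """Keep a gene only if every sample has at least `minimum` counts."""
    return all(count >= minimum for count in counts_per_sample)
```

For a coding sequence of 100 codons, for example, codons 15 to 94 would be counted under these assumptions, which excludes the regions near the start and stop codons.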

Polysome profiling. Cell lysate was prepared as described**. Lysate containing
3 µg total RNA was loaded on to 10-50% linear sucrose gradients containing
20 mM Tris pH 7.4, 150 mM NaCl, 5 mM MgCl₂, 1 mM DTT, 100 µg ml⁻¹
cycloheximide, and 2 U ml⁻¹ SUPERase·In RNase inhibitor and centrifuged at
36,000 rpm for 2.5 h at 4 °C with SW41 rotor (Beckman Coulter). UV absorbance 
of fractionated gradient with Gradient station (Biocomp) was detected by ECONO 
UV monitor (Biorad). Monosome/Polysome ratio was determined by integration 
of the area under the respective peaks using Igor Pro software (WaveMetrics). 
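
The peak integration was done in Igor Pro. As an illustration of the same idea, the Python sketch below integrates a UV trace with the trapezoidal rule over two position ranges. The trace, the peak boundaries and the absence of baseline subtraction are all assumptions.

```python
def trapezoid_area(positions, absorbance, start, end):
    """Trapezoidal area under the trace for segments lying within [start, end]."""
    area = 0.0
    points = list(zip(positions, absorbance))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= start and x1 <= end:
            area += 0.5 * (y0 + y1) * (x1 - x0)
    return area


# Hypothetical UV trace along the gradient (position, absorbance).
positions = [0, 1, 2, 3, 4, 5, 6]
absorbance = [0.0, 2.0, 0.0, 1.0, 1.0, 1.0, 0.0]

monosome_area = trapezoid_area(positions, absorbance, 0, 2)
polysome_area = trapezoid_area(positions, absorbance, 2, 6)
ratio = monosome_area / polysome_area
```

With real data the boundaries would be set at the valleys flanking the 80S peak and the polysome region, and a baseline would normally be subtracted before integration.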

Metabolic labelling. To determine global mRNA translation in control or
KBTBD8-depleted hESCs, hESCs subjected to embryoid body formation for
3 days, or hESCs subjected to neural conversion for 3 days, we employed metabolic
labelling. Cells were pre-equilibrated in labelling medium (RPMI without
methionine containing 10% dialysed FBS) for 15 min, followed by preparation of
single-cell suspensions by accutase treatment. A total of 0.6 × 10⁶ cells (for
hESCs and hESCs that were subjected to neural conversion for three days) or
~300 EBs (initial seeding cell number: 3,000 cells per EB) were pulsed with
100 µl labelling medium containing 0.1 mCi ml⁻¹ [³⁵S]-methionine at 37 °C for
20 min. After washing with PBS,
cells were lysed in 8 M urea, 100 mM Tris pH 8.0, diluted in SDS loading buffer,
sonicated, and lysates were analysed by SDS-PAGE and autoradiography.
[³⁵S]-methionine incorporation was quantified using ImageJ software, normalized to
total protein amount, and expressed relative to the control hESC sample (set to 1). 
Error bars denote standard deviation of three biological replicates. 

Extended Data Figure 1 | KBTBD8 is a developmentally regulated CUL3
adaptor. a, Gene expression analysis by microarray of hESCs differentiated
into embryoid bodies (EB) for 6 days (n > 30,000 transcripts, mean of 3
biological replicates, analysis of variance (ANOVA) P value < 0.05; blue,
downregulated genes; red, upregulated genes). b, Expression analysis of all CRL
substrate adaptors, including KBTBD8, with data derived from the experiment
described above. c, Expression analysis of CUL3 adaptors during hESC
differentiation into hEBs (blue, downregulation; yellow, upregulation).
d, mRNA levels of pluripotency markers and KBTBD8 during hESC
differentiation into EBs, as determined by qRT-PCR (mean of 3 technical
replicates ± s.e.m.). e, Protein levels of KBTBD8 during hESC differentiation
into EBs, as seen by western blot (OCT4, NANOG: pluripotency; PAX6: CNS
precursors; TFAP2: neural crest marker). f, KBTBD8 is expressed in hESCs,
but not in somatic cell lines, as determined by qRT-PCR (mean of 3 technical
replicates ± s.e.m.). g, Abundance of KBTBD8 in H9 hESCs, D3 mESCs, or
somatic cell lines was determined by western blot analysis. h, KBTBD8
expression is downregulated during mouse embryonic stem cell (mESC)
differentiation into mouse embryoid bodies, as determined by qRT-PCR
(mean of 3 technical replicates ± s.e.m.). i, KBTBD8 protein levels are reduced
during mESC differentiation, as shown by western blot analysis.

Extended Data Figure 2 | KBTBD8 controls neural crest formation.
a, Stable depletion of KBTBD8 from H1 hESCs, as determined by western blot
analysis. b, KBTBD8 depletion does not significantly change the cell cycle
profile of hESCs, as determined by propidium iodide staining and FACS.
c, Control or KBTBD8-depleted hESCs were counted at indicated times after
seeding (mean of 3 biological replicates ± s.d.). d, KBTBD8 depletion does not
induce apoptosis in hESCs, as shown by immunostaining against cleaved
caspase 3 (red) or DNA (blue) (200 cells per condition; scale bar, 10 µm).
e, KBTBD8 depletion does not affect the gene expression profile of hESCs, as
determined by microarray analysis (genes > 2.5-fold change, n > 30,000; mean
of 3 biological replicates, ANOVA P value < 0.05). f, Loss of KBTBD8 causes a
decrease in the expression of neural crest cell markers during EB formation,
as shown by comparative microarray analysis (genes > 2.5-fold change,
n > 30,000; mean of 3 biological replicates, ANOVA P value < 0.05). g, mRNA
levels of pluripotency and differentiation markers in EBs stably expressing
control or KBTBD8 shRNAs were measured by qRT-PCR (3 technical
replicates ± s.e.m.).

Extended Data Figure 3 | KBTBD8 controls neural crest specification.
a, Depletion of KBTBD8 from hESCs subjected to neural conversion results in
loss of neural crest cells, as determined by immunofluorescence against HNK1,
TFAP2 and p75 (n > 200 cells, mean of 3 biological replicates ± s.d.). b, H1
hESCs transduced with control (green) or KBTBD8 shRNAs (red) were
subjected to neural conversion, and expression of neural crest markers SOX10
(circles) and SNAIL2 (boxes) was monitored by qRT-PCR (mean of 3 technical
replicates ± s.e.m.). c, H1 hESCs described above were subjected to neural
conversion, and abundance of CNS precursor markers SOX2 (circles) and
PAX6 (boxes) was measured by qRT-PCR. d, H1 hESCs described above were
subjected to neural conversion, and abundance of telencephalon markers SIX3
(circles) and FOXG1 (boxes) was measured by qRT-PCR. e, Expression of
OCT4 was monitored by qRT-PCR during neural conversion in the presence
or absence of KBTBD8. f, hESCs stably expressing control or KBTBD8
shRNAs were subjected to neural conversion and analysed for expression of
pluripotency (OCT4, CDH1), neural crest (SOX10, SNAIL2, AP2), or CNS
precursor markers (PAX6) by western blotting. To provide consistency,
samples were taken from the same experiment as shown in Fig. 5d (asterisks
mark blots that are also shown in Fig. 5d). g, Loss of neural crest occurs in
response to KBTBD8 depletion by two independent shRNAs, as shown
by western blot analysis. h, hESCs were subjected to neural conversion and
analysed by immunofluorescence microscopy against SOX10 (neural crest),
PAX6 (CNS precursor), and OCT4 (pluripotency) (confocal, original
magnification 20×).

Extended Data Figure 4 | KBTBD8 is required for differentiation into
functional neural crest cells. a, H1 hESCs stably expressing control or
KBTBD8 shRNAs were subjected to neural conversion for 43 days and analysed
by immunofluorescence microscopy against GFAP (glia), smooth muscle actin
(SMA; mesenchymal cells), and neurofilament L (neurons). b, Control H1
hESCs or hESCs depleted of KBTBD8 were subjected to neural conversion for
43 days and expression of markers for glia (GFAP), mesenchyme (smooth
muscle actin, SMA), melanocytes (TYRP1, DCT), chondrocytes (COL2A1), or
CNS derivatives (PAX6, NESTIN, neurofilament L) was analysed by qRT-PCR
(mean of 3 technical replicates ± s.e.m.). c, Xenopus tropicalis embryos were
injected at the two-cell stage with splice-blocking morpholinos (sMO)
against CUL3 or KBTBD8, or with a dominant-negative construct of CUL3 that
allows KBTBD8 to bind, but not ubiquitylate, substrates. Neural crest
formation was monitored by SOX10 in situ hybridization. Quantification
included experiment shown in Fig. 1d (mean of 3 biological replicates ± s.d.;
~20 embryos per condition and replicate).

Extended Data Figure 5 | Biochemical characterization of the substrate
adaptor role of KBTBD8. a, Domain structure of KBTBD8, including the
residues mutated to generate ubiquitylation- (Y74A) and substrate-binding-
deficient KBTBD8 (F550A, W579A). b, Effects of point mutations in predicted
KELCH domain loops on binding of KBTBD8 to candidate substrates were
determined by affinity purification and western blot analysis. c, Effects of
point mutations in BTB domain on binding of KBTBD8 to CUL3 were
determined by affinity purification and western blotting. Dimerization of
Flag–KBTBD8 with KBTBD8-HA was analysed in the same experiment to provide
a folding control. d, Binding of recombinant CUL3 to immobilized
recombinant MBP-KBTBD8 variants was analysed by Coomassie. e, Binding
of in vitro-transcribed/translated ³⁵S-NOLC1 to immobilized recombinant
KBTBD8 variants was analysed by autoradiography. f, Binding of in vitro-
transcribed/translated ³⁵S-TCOF1 to immobilized recombinant KBTBD8
variants was analysed by autoradiography. g, Endogenous β-arrestin proteins
in reticulocyte lysates bind immobilized, recombinant KBTBD8, as detected
by western blot analysis. h, 293T cells were transfected with control- or
β-arrestin 1/2-siRNAs and reconstituted with Flag-KBTBD8. Binding of
KBTBD8 to endogenous TCOF1 and NOLC1 was analysed by anti-Flag affinity
purification and western blot analysis. i, Ubiquitylation of HA-TCOF1 in
293T cells depleted of β-arrestin 1/2 and reconstituted with KBTBD8 was
determined after denaturing Ni-NTA purification by western blotting as
described above. j, Ubiquitylation of HA-NOLC1 was detected in 293T cells
depleted of β-arrestins and reconstituted with KBTBD8, as described above.

Extended Data Figure 6 | KBTBD8 specifies neural crest fate through
TCOF1 and NOLC1. a, mRNA levels of KBTBD8, NOLC1 and TCOF1 were
determined in hESCs or differentiating cells transduced with lentiviruses
expressing the indicated shRNAs by qRT-PCR (mean of 3 technical
replicates ± s.e.m.). b, hESCs stably depleted of KBTBD8 and reconstituted
with either wild-type KBTBD8, KBTBD8(W579A), or KBTBD8(Y74A) were
subjected to neural conversion (9 days) and analysed for the expression of
marker proteins by qRT-PCR (mean of 3 technical replicates ± s.e.m.).
c, hESCs stably depleted of KBTBD8, TCOF1, or NOLC1 were subjected to
neural conversion (9 days) and analysed for marker expression by qRT-PCR
(mean of 3 technical replicates ± s.e.m.). d, Depletion of TCOF1 or NOLC1
from hESCs results in loss of neural crest cells, as determined by triple staining
immunofluorescence against the neural crest markers HNK1, TFAP2 and
p75 (n > 200 cells, mean of 3 biological replicates ± s.d.). Scale bar, 10 µm.
e, hESCs were transduced with lentiviruses expressing control or BRD2
shRNAs, subjected to puromycin selection for 7 days, and analysed by western
blot analysis. f, Depletion efficiency for shRNAs against various KBTBD8
binding partners, as determined by qRT-PCR (mean of 3 technical
replicates ± s.e.m.).

NOLC1. Scale bar, 10 µm. c, Total spectral counts of proteins associated with
TCOF1–NOLC1 complexes purified by sequential immunoprecipitation in the
[...] was analysed by western blot.

Extended Data Figure 8 | KBTBD8 is not required for general ribosome
biogenesis. a, hESCs stably depleted of KBTBD8 were subjected to neural
conversion and levels of 5S rRNA, 18S rRNA and mRNAs encoding RPS6,
RPS28, RPL10A and RPL28 were measured by qRT-PCR (mean of 3 technical
replicates ± s.e.m.). b, hESCs stably depleted of KBTBD8 were subjected to
neural conversion, and total RNA was subjected to a bioanalyser assay to
monitor processing of ribosomal RNAs. c, hESCs stably depleted of KBTBD8
were subjected to neural conversion (3 days), and nucleoli were analysed by
anti-fibrillarin (original magnification: 60×, confocal) immunofluorescence
microscopy. d, Quantification of nucleolar analysis described above (mean of 3
technical replicates ± s.e.m.). e, hESCs stably depleted of KBTBD8 were
analysed for localization of 5.8S rRNA by anti-5.8S rRNA immunofluorescence
microscopy (original magnification: 60×, confocal). f, hESCs depleted of
KBTBD8 were subjected to neural conversion and analysed by anti-5.8S rRNA
immunofluorescence microscopy (original magnification: 60×, confocal).
g, Polysomes were purified from control or KBTBD8-depleted hESCs and
differentiated counterparts subjected to neural conversion via sucrose gradient
centrifugation followed by fractionation and UV detection. h, KBTBD8-
depleted hESCs were subjected to neural conversion for 9 days and analysed for
apoptosis by immunofluorescence analysis against cleaved caspase 3 (red) and
DNA (Hoechst, blue). Cells with active caspase 3 staining were quantified
(~200 cells per condition; scale bar, 10 µm) (original magnification: 40×,
confocal).

Extended Data Figure 9 | Characterization of KBTBD8- and TCOF1-
depleted hESCs during neural conversion. a, hESCs were treated with
increasing concentrations of rapamycin, subjected to neural conversion for
9 days, and analysed for expression of neural crest or CNS precursor markers by
qRT-PCR. For comparison, effects of KBTBD8, TCOF1, or NOLC1 depletion
(extracted from Fig. 3a) are shown. b, hESCs were treated with rapamycin,
subjected to neural conversion, and analysed for marker expression by western
blotting. c, hESCs were depleted of KBTBD8 or TCOF1, subjected to neural
conversion for 3 days, and analysed for expression of 5S and 18S rRNA by
qRT-PCR (mean of 3 technical replicates ± s.e.m.). d, hESCs depleted of
KBTBD8 or TCOF1 were subjected to neural conversion for 3 days and
analysed for p53 activation by RNA-seq against p53 targets. e, hESCs were
depleted of KBTBD8 or TCOF1, subjected to neural conversion for 3 days, and
analysed for apoptosis by immunofluorescence microscopy against cleaved
caspase 3. Quantification shown below (~200 cells per condition). f, hESCs
depleted of KBTBD8 were subjected to neural conversion for 9 days and
analysed for expression levels of 5S and 18S rRNA by qRT-PCR (mean of 3
technical replicates ± s.e.m.). g, hESCs stably depleted of NOLC1 or
TCOF1 were subjected to neural conversion for 9 days and analysed by
immunofluorescence microscopy against cleaved caspase 3 (red) or DNA
(Hoechst, blue). Quantification is shown below (~200 cells per condition;
scale bar, 10 µm).

Extended Data Figure 10 | KBTBD8 controls translation. a, hESCs stably
depleted of KBTBD8 were subjected to neural conversion for 3 days, and hESCs
and differentiating cells were analysed by RNA deep sequencing and
ribosomal profiling to determine translation efficiency. Distribution of
translation efficiency changes for 7,725 mRNAs brought about by KBTBD8
depletion is shown. b, hESCs stably depleted of either TCOF1 or KBTBD8 were
subjected to neural conversion for 3 days, and translation efficiency was
determined by RNA-seq and ribosome profiling. c, Translation efficiency plot
of differentiating hESCs transduced with control or KBTBD8 shRNAs was
labelled for significantly affected transcripts in general (blue), with links to CNS
precursor formation (gold), or with links to neural crest formation (green).
d, hESCs stably depleted of KBTBD8 or TCOF1 were subjected to neural
conversion for 3 days, and expression levels of indicated proteins were analysed
by western blotting. e, hESCs stably depleted of KBTBD8 were subjected to
neural conversion for 3 days, and levels of ATRX1 and PCM1 mRNA were
determined by qRT-PCR (mean of 3 technical replicates ± s.e.m.). f, hESCs
stably depleted of KBTBD8 were subjected to neural conversion for 3 days,
and protein stability of ATRX1 and PCM1 was determined by cycloheximide
chase and western blotting (mean of 3 biological replicates ± s.d., ATRX1
and PCM1 levels were normalized relative to actin levels and 0 h time point
set to 100%).

Neutrophil ageing is regulated by the microbiome

Dachuan Zhang, Grace Chen, Deepa Manwani, Arthur Mortha, Chunliang Xu, Jeremiah J. Faith, Robert D. Burk,
Yuya Kunisaki, Jung-Eun Jang, Christoph Scheiermann, Miriam Merad & Paul S. Frenette


Blood polymorphonuclear neutrophils provide immune protec- 
tion against pathogens, but may also promote tissue injury in 
inflammatory diseases’”. Although neutrophils are generally con- 
sidered to be a relatively homogeneous population, evidence for 
heterogeneity is emerging**. Under steady-state conditions, neu- 
trophil heterogeneity may arise from ageing and replenishment by 
newly released neutrophils from the bone marrow’. Aged neutro- 
phils upregulate CXCR4, a receptor allowing their clearance in the 
bone marrow®”, with feedback inhibition of neutrophil production 
via the IL-17/G-CSF axis®, and rhythmic modulation of the haema- 
topoietic stem-cell niche’. The aged subset also expresses low 
levels of L-selectin®’. Previous studies have suggested that in 
vitro-aged neutrophils exhibit impaired migration and reduced 
pro-inflammatory properties’. Here, using in vivo ageing ana- 
lyses in mice, we show that neutrophil pro-inflammatory activity 
correlates positively with their ageing whilst in circulation. Aged 
neutrophils represent an overly active subset exhibiting enhanced 
αMβ2 integrin activation and neutrophil extracellular trap formation
under inflammatory conditions. Neutrophil ageing is driven by the 
microbiota via Toll-like receptor and myeloid differentiation factor 
88-mediated signalling pathways. Depletion of the microbiota sig- 
nificantly reduces the number of circulating aged neutrophils and 
dramatically improves the pathogenesis and inflammation-related 
organ damage in models of sickle-cell disease or endotoxin-induced 
septic shock. These results identify a role for the microbiota in 
regulating a disease-promoting neutrophil subset. 

Neutrophils are a critical component of innate immunity. However, 
activated neutrophils can also promote certain diseases by secreting 
pro-inflammatory cytokines, and by interacting with other immune or 
blood cells”. For example, activation of αMβ2 (Mac-1) integrin enables
adherent neutrophils to interact with platelets and red blood cells 
(RBCs)"". In sickle-cell disease (SCD), a severe blood disorder originating
from a single mutation in the β-globin gene (Hbb)”’, the capture
of sickle RBCs by activated Mac-1 on adherent neutrophils leads to 
acute vaso-occlusion, resulting in life-threatening crises'!!*!*. 
Intravital microscopy analyses have revealed considerable heterogen- 
eity in Mac-1 activation on neutrophils recruited to the same venules 
of SCD mice, suggesting that subsets of neutrophils differ markedly in 
their pro-inflammatory activity". 

To investigate whether the heterogeneity in pro-inflammatory 
activity of neutrophils is associated with their ageing, we first validated 
that neutrophils progressively lost L-selectin (CD62L) expression and 
upregulated CXCR4 as they aged whilst in circulation® (Extended Data 
Fig. 1a, b). We analysed CD62L(lo) neutrophils in vivo using multichannel
fluorescence intravital microscopy (MFIM) analysis''!’.
Notably, CD62L expression on adherent neutrophils in tumour necrosis
factor α (TNFα)-inflamed post-capillary venules inversely correlated
with Mac-1 activation (Fig. 1a, b), as determined using
fluorescent microsphere beads that specifically bound to activated
Mac-1". In addition, a similar inverse correlation was observed in
the ability of adherent neutrophils to capture RBCs (Fig. 1b). 

Next, we analysed neutrophil populations in mice lacking P-selectin 
(Selp−/−), an adhesion molecule essential for neutrophil recruitment
under steady-state conditions™’’. We found that the CD62L(lo)CXCR4(hi)
aged neutrophil population was dramatically expanded in Selp−/−
mice, and neutrophils harvested from Selp−/− mice showed significantly
higher Mac-1 activity compared to those harvested from wild-type mice 
(Fig. 1c and Extended Data Fig. 1c, d). In addition, we analysed 
neutrophil populations after depletion of macrophages—which medi- 
ate neutrophil clearance'°—using animals expressing a knocked-in 
diphtheria toxin receptor at the Cd169 locus (CD169-DTR)"’. We 
found that the aged neutrophil population was significantly expanded 
without elevation of major inflammatory cytokines, and Mac-1 
activation was significantly increased on adherent neutrophils in 
macrophage-depleted mice (Fig. 1d and Extended Data Fig. 1e, f).
These data suggest that CD62L(lo)CXCR4(hi) aged neutrophils exhibit
enhanced Mac-1 activation during inflammation. 

To evaluate the specificities of ageing versus the activation of an 
inflammatory program, we compared the transcriptome of control, 
aged and TNFα-activated neutrophils. We transfused whole blood
and collected donor neutrophils 6 h later to derive in vivo-aged
neutrophils, and compared them to control neutrophils that were
transferred for only 10 min. Additionally, we obtained neutrophils from
TNFα-treated mice for comparison with neutrophils activated by
systemic inflammation. Gene set enrichment analyses’* revealed that aged
neutrophils differed from activated neutrophils in many aspects, such as
cytokine and chemokine secretion, Ras and p38/MAPK signalling
pathways (Fig. 1e). However, aged neutrophils upregulated several pathways
that were also enhanced during neutrophil activation, including
integrin and leukocyte adhesion, Toll-like receptor (TLR) and NOD-like
receptor (NLR), and NF-κB signalling pathways (Fig. 1e, f and
Extended Data Table 1, 2). Analysis of surface antigens revealed that
aged neutrophils exhibited significantly higher levels of TLR4 and 
molecules involved in cell migration and intercellular interactions, 
including CD11b, CD49d, and Icam1 (Fig. 1g and Extended Data Fig. 
1g-i). These results demonstrate that neutrophils constitutively receive 
priming signals and become more active as they age in circulation. 

The upregulation of several inflammatory pathways in aged neutro- 
phils suggests a contribution by exogenous inflammatory mediators. 
Microbiota-derived molecules may cross the intestinal barrier to exert 
systemic influences, affecting multiple immune populations including T 
cells, innate lymphoid cells and macrophages’””’. Recent studies suggest 
that neutrophil production and the phagocytic capacity of bone-marrow- 
derived neutrophils may be regulated by the microbiota”'™, raising the 
possibility that these factors also influence the ageing process of circulat- 
ing neutrophils. 

We sought to test this hypothesis by treating mice with broad-spectrum
antibiotics (ABX) for 4–6 weeks’’”*, which led to highly
efficient depletion and dramatic alterations in the composition of the
gut microbiota (Extended Data Fig. 2a–d). Microbiota depletion
resulted in significant and selective reductions of neutrophil numbers
in circulation and bone marrow, and a significant reduction in spleen
cellularity with decreased numbers of multiple leukocyte populations
(Extended Data Fig. 3a–d). Notably, both the percentages and numbers
of aged neutrophils were significantly reduced in ABX-treated
mice, and the numbers were completely restored when the TLR4
ligand lipopolysaccharide (LPS) was added back by intragastric gavage
(Fig. 2a). We further analysed neutrophil-LPS interaction by
administering fluorescently labelled LPS, and found that as soon as 1 h after
LPS gavage, specific fluorescence signals were detectable on
neutrophils in circulation, the spleen and bone marrow (Extended Data
Fig. 3e). In addition, we found that the reintroduction of the TLR2
ligand peptidoglycan, but not the NOD1/2 activator M-TriDAP, could
also restore the numbers of aged neutrophils in ABX-treated mice
(Extended Data Fig. 3f), suggesting that multiple microbiota-derived
molecules may contribute to neutrophil ageing under steady-state
conditions.

Figure 1 | Aged neutrophils represent an overly active subset of neutrophils.
a, MFIM analysis of CD62L expression (red) and Mac-1 specific
albumin-coated fluorescent microsphere beads (green) captured by adherent neutrophils
(dashed lines). Left, fluorescence channel; right, fluorescence combined with
brightfield channels. MFI, mean fluorescence intensity. Scale bar, 10 μm.
b, Correlation between CD62L expression and bead capture or
neutrophil-RBC interaction (n = 126 cells from 3 mice). c, d, Flow cytometry analysis of
CD62L(lo)CXCR4(hi) aged neutrophils and MFIM analysis of Mac-1 activation on
neutrophils from wild-type (WT) and Selp−/− mice (c; middle, n = 5, 6 mice,
respectively; right, n = 3, 4 mice, respectively), or in diphtheria-toxin-treated
wild-type and CD169-DTR mice (d; middle: n = 7 mice; right: n = 8 mice).
e, Heat map of normalized enrichment scores (NES) for selected pathways in
aged and TNFα-activated neutrophils, as compared to control neutrophils
(n = 3 mice). Red, upregulation; blue, downregulation. f, g, Cxcl2, Itgam and
Tlr4 mRNA expression levels determined by quantitative PCR in control and
aged neutrophils (f; left, n = 5, 6 mice, respectively; middle, n = 6, 4 mice,
respectively; right, n = 6, 5 mice, respectively), and surface expression levels of
TLR4 and selected adhesion molecules determined by flow cytometry on
CD62L(lo) aged and CD62L(hi) young neutrophils (g; left, n = 5 mice; right, n = 4
mice). Error bars, mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, data
representing two or more independent experiments analysed with one-way
ANOVA (b) or unpaired Student’s t-test (c, d, f, g).

To validate that the microbiota could indeed regulate neutrophil 
ageing, we analysed neutrophil populations in germ-free mice. 
Compared to specific-pathogen-free animals, germ-free mice exhib- 
ited broad alterations in both innate and adaptive immune cells 
(Extended Data Fig. 4a–c). Consistently, the numbers of total and aged
neutrophils were significantly reduced in germ-free mice, and the 
numbers were partially restored when germ-free mice were reconsti- 
tuted by fecal transplantation. In addition, treating germ-free mice 
with ABX did not further reduce their aged neutrophil numbers 
(Fig. 2b and Extended Data Fig. 4d, e). Furthermore, we transfused 
whole blood obtained from wild-type donor mice into wild-type,
ABX-treated or germ-free recipients. The percentages of chronologically
aged donor neutrophils progressively increased after the transfusion,
but at a significantly slower rate in ABX-treated and germ-free
recipients (Fig. 2c). These results strongly suggest that neutrophil
ageing is delayed in a bacterially depleted environment.

Figure 2 | Neutrophil ageing is driven by the microbiota. a, Flow cytometry
analysis of aged neutrophils in control, antibiotics (ABX)-treated mice and
ABX-treated mice fed with LPS (n = 12, 7, 5 mice, respectively). b, Numbers of
aged neutrophils in specific-pathogen-free (SPF) mice, germ-free (GF) mice,
germ-free mice reconstituted by fecal transplantation (GF-FT), and germ-free
mice treated with antibiotics (GF-ABX) (n = 5, 5, 5, 3 mice, respectively).
c, Ageing kinetics of donor neutrophils after adoptive transfer into control,
ABX-treated or germ-free recipients (n = 4 mice). d, EdU pulse-chase analysis
of neutrophil release-clearance kinetics in control and ABX-treated mice
(Ctrl, n = 5, 5, 6, 8, 5 mice; ABX, n = 5, 5, 5, 6, 6 mice for days 3, 4, 5, 6 and 7,
respectively). e-g, Representative images (e) and quantification of neutrophil
adhesion (dotted lines; f) and Mac-1 activation on adherent neutrophils
(g) in control and ABX-treated mice (n = 4 mice). Scale bar, 10 μm. Error bars,
mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, data representing two or
more independent experiments analysed with unpaired Student’s t-test.

Since alterations in neutrophil clearance may influence the number
of aged neutrophils (Fig. 1c, d), we investigated whether ABX
treatment modulates neutrophil ageing by acting on clearance 
mechanisms. We first analysed adhesion molecule expression on 
endothelial cells and observed no difference between control and 
ABX-treated mice (Extended Data Fig. 5a). Next, we analysed macro- 
phage numbers in the spleen, bone marrow and liver, the organs that 
clear neutrophils”. We found that macrophage numbers decreased by 
~36% in the spleen, increased by ~30% in the bone marrow, and did 
not change in the liver of ABX-treated mice (Extended Data Fig. 5b, c). 
Furthermore, we depleted macrophages in ABX-treated mice using the 
CD169-DTR model, and found that ABX treatment significantly 
reduced aged neutrophil numbers in macrophage-depleted animals 
(Extended Data Fig. 5c, d), suggesting that microbiota-driven ageing 
and macrophage-mediated clearance are independent mechanisms 
that regulate the number of aged neutrophils. We then analysed the 
release-clearance kinetics of circulating neutrophils using a 5-ethynyl- 
2'-deoxyuridine (EdU) pulse-chase labelling strategy’®. We observed 
significantly more EdU+ neutrophils remaining in circulation on day 7,
suggesting a delayed clearance in ABX-treated mice (Fig. 2d). In
addition, we investigated the functional impact of microbiota depletion
using intravital microscopy, and observed significant reductions in
neutrophil adhesion and Mac-1 activation in ABX-treated compared
to control mice (Fig. 2e-g). These data suggest that neutrophil ageing,
which leads to the generation of a functionally overly active subset of
neutrophils, is driven by the microbiota.

Figure 3 | Microbiota-driven neutrophil ageing is mediated by neutrophil
TLRs and Myd88 signalling. a, Percentages of aged neutrophils in wild-type
and LysM-cre/Myd88fl/fl mice, as analysed by flow cytometry (n = 12, 10 mice,
respectively). b, Percentages of aged neutrophils in wild-type and Tnf−/− or
Csf2−/− mice (n = 5 mice). KO, knockout. c, Ageing kinetics of donor
neutrophils after adoptive transfer from either wild-type or LysM-cre/Myd88fl/fl
mice into wild-type or LysM-cre/Myd88fl/fl recipients (n = 6 mice).
d, Percentages of the aged subset in wild-type and LysM-cre/Myd88fl/fl, Tlr4−/−
or Tlr2−/− neutrophils in chimaeric mice (n = 5 mice). e, MFIM analysis of
Mac-1 activation on wild-type and LysM-cre/Myd88fl/fl, Tlr4−/− or Tlr2−/−
neutrophils in chimaeric mice (n = 5 mice). f, Representative images showing
wild-type (CD45.1+, blue) and Tlr4−/− (CD45.2+, red) neutrophils and beads
(green). Scale bar, 10 μm. Error bars, mean ± s.e.m. *P < 0.05, **P < 0.01,
***P < 0.001, data representing two or more independent experiments
analysed with unpaired Student’s t-test (a-c) or paired Student’s t-test (d, e).

Neutrophils express multiple pattern recognition receptors, including 
TLR2 and TLR4”, which may directly transduce microbiota-derived 
signals. Alternatively, microbiota-derived signals may stimulate certain 
immune cells to secrete pro-inflammatory cytokines such as TNFα and
granulocyte-macrophage colony-stimulating factor (GM-CSF)'*”’, 
which could in turn prime circulating neutrophils. To investigate how 
microbiota-derived signals regulate neutrophil ageing, we characterized 
aged neutrophils in LysM-cre/Myd88fl/fl mice, in which Myd88, a signalling
molecule that mediates most TLR signalling, is specifically
deleted in myeloid cells. Notably, we observed significant reductions 
in the percentages and numbers of aged neutrophils in these mice 
(Fig. 3a and Extended Data Fig. 5e). Similarly, we also found significant 
reductions of aged neutrophils in TLR2- and TLR4-knockout mice 
(Extended Data Fig. 5f). By contrast, the aged neutrophil population 
was expanded in TNFα- and GM-CSF-knockout mice (Fig. 3b and
Extended Data Fig. 5g), suggesting that the microbiota may not drive 
neutrophil ageing by stimulating TNFα or GM-CSF secretion.

To further delineate how Myd88 mediates microbiota-driven ageing,
we adoptively transferred either LysM-cre/Myd88fl/fl or wild-type
neutrophils into wild-type or LysM-cre/Myd88fl/fl recipient mice, and
analysed donor neutrophil ageing in vivo. Interestingly, neutrophil
ageing was almost completely abrogated when Myd88-deficient
neutrophils were transferred into wild-type recipients, whereas the ageing
kinetics remained largely unaffected when wild-type neutrophils were
transferred into LysM-cre/Myd88fl/fl recipients (Fig. 3c). Next, we
generated chimaeric mice reconstituted with a mixture of wild-type and
Myd88-, TLR4- or TLR2-deficient bone marrow cells, which enabled
us to compare wild-type and deficient neutrophils in the same mouse,
thus avoiding potential differences caused by the environment.
Consistently, we observed significantly lower percentages of the aged
subset in Myd88-, TLR2- and TLR4-deficient neutrophils, compared
to wild-type neutrophils in the same chimaeric mice (Fig. 3d). By
contrast, the percentages of total neutrophils in Myd88-, TLR2- and
TLR4-deficient leukocytes were unaltered or expanded (Extended
Data Fig. 5h), suggesting a specific effect of TLR and Myd88 deficiency
on ageing, but not the generation, of neutrophils. We also subjected
these chimaeric mice to intravital microscopy, and found significantly
lower Mac-1 activity on Myd88-, TLR4- and TLR2-deficient
neutrophils (Fig. 3e, f and Extended Data Fig. 5i). These findings
strongly suggest that neutrophil TLRs and Myd88 signalling mediate
microbiota-driven neutrophil ageing.

Figure 4 | Microbiota depletion reduces vaso-occlusive events in sickle-cell
disease. a, Flow cytometry analysis of aged neutrophils in hemizygous control
(SA), control SCD (SS ctrl) and antibiotic-treated SCD (SS ABX) mice (n = 8,
9, 9 mice, respectively). b, MFIM analysis of neutrophil adhesion and Mac-1
activation on adherent neutrophils in SA, SS control and SS ABX mice (n = 6, 8,
10 mice, respectively). Scale bar, 10 μm. c, Mac-1 activation on adherent
neutrophils and neutrophil-RBC interaction in SA, SS control and SS ABX
mice (left, n = 4, 3, 3 mice, respectively; right, n = 69, 42, 54 vessels from 6,
7 and 7 mice, respectively). d, Blood flow and survival time of SS control and SS
ABX mice in acute vaso-occlusive crisis (left, n = 36, 48 vessels from 7 and 8
mice, respectively; right, n = 6 mice). e, Representative images and weights of
spleen in SS control and SS ABX mice (n = 6 mice). f, Haematoxylin and eosin
staining showing liver damage in SS control and SS ABX mice. Arrow, liver
fibrosis; arrowheads, necrosis and inflammation. Scale bars, 50 μm. g, Survival
time of SS mice treated with PBS- or clodronate-encapsulated liposome
(lipo; n = 7, 8 mice, respectively). h, Numbers of total and aged neutrophils in
healthy controls, SCD patients (SS), and SCD patients on penicillin V
prophylaxis (SS-PV; n = 9, 23, 11 subjects, respectively). Error bars,
mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, data representing two
or more independent experiments analysed with unpaired Student’s t-test
(a-d (left), e, h) or log-rank test (d (right), g).

In addition to analysing Mac-1 activation, we investigated whether 
ageing affects the ability of neutrophils to form neutrophil extracellular 
traps (NET) in response to pathological stimulation. We enriched 
aged neutrophils by injecting antibodies to block P- and E-selectins’, 
and found that neutrophils obtained from anti-selectin-treated mice 
exhibited significantly increased reactive oxygen species (ROS) pro- 
duction (Extended Data Fig. 6a, b). We induced NET formation by 
treating neutrophils with LPS in vitro’, and quantified NETs based on 
the co-localization of DNA with citrullinated histone H3 (CitH3) and 
neutrophil elastase”. We found that NET formation was significantly 
increased in neutrophils collected from anti-selectin-treated mice. By 
contrast, neutrophils isolated from ABX-treated mice exhibited a 
marked reduction in NET formation (Extended Data Fig. 6c, d). 

To investigate whether microbiota depletion impacts NET forma- 
tion in vivo, we challenged control and ABX-treated mice with a lethal 
dose of LPS to induce septic shock’*, and injected fluorophore- 
conjugated antibodies to image NETs in the liver vasculature. 
Notably, we observed a significant reduction in the number of NETs 
formed in septic liver after microbiota depletion, and dramatic 
decreases in soluble NET biomarkers—plasma nucleosome and 
plasma DNA (Extended Data Fig. 7a, b). In addition, immunofluorescence
analyses revealed that the septic liver vasculature contained numerous
aggregated CitH3+ neutrophils, which were commonly found to be
associated with fibrin deposition (Extended Data Fig. 7c, d). By contrast,
CitH3+ neutrophils, neutrophil aggregates and fibrin deposition were
markedly reduced in ABX-treated mice, leading to significantly prolonged
survival of these mice (Extended Data Fig. 7c-g). Remarkably,
this improvement in survival was abrogated by infusing aged neutrophils, 
but not by infusing the same number of young neutrophils back into 
ABX-treated mice (Extended Data Fig. 7g). 

To assess further the role of neutrophil ageing in a disease model, we 
analysed mice with SCD, a disease characterized by recurrent episodes 
of vaso-occlusion in which neutrophils play a primary function''”*. 
While SCD mice exhibited significant expansion of all major leukocyte 
subsets compared to hemizygous mice, ABX-mediated microbiota 
depletion led to a significant and selective reduction of neutrophils, 
but not other leukocyte populations (Extended Data Fig. 8a). Notably, 
the numbers of aged neutrophils were expanded more than tenfold in 
SCD mice, and the expansion was completely abrogated by microbiota 
depletion (Fig. 4a). 

To investigate the functional impact of reduced aged neutrophil 
numbers in disease outcome, we challenged hemizygous, untreated 
and ABX-treated SCD mice with TNFα and evaluated the cremaster
microcirculation by intravital microscopy'’'*. SCD mice exhibited 
significant increases in neutrophil adhesion, Mac-1 activation and 
heterotypic interactions with RBCs compared to hemizygous mice, 
all of which were markedly reduced by microbiota depletion 
(Fig. 4b, c and Extended Data Fig. 8b, c), resulting in enhanced blood 
flow and significantly improved survival of ABX-treated SCD mice 
(Fig. 4d). Interestingly, the splenomegaly of SCD mice was signifi- 
cantly reduced, and liver damage including fibrosis, necrosis and 
inflammation was dramatically alleviated in ABX-treated SCD mice 
(Fig. 4e, f and Extended Data Fig. 8d, e). To test the impact of impaired
neutrophil clearance in SCD, we depleted macrophages in the liver, 
spleen and bone marrow using clodronate liposomes’’. Macrophage 
depletion markedly increased circulating aged neutrophils (data not 
shown) and resulted in acute vaso-occlusive crises that led to the death 
of all mice within 10–30 h (Fig. 4g). Together, these data suggest that
the microbiota regulates aged neutrophil numbers, thereby affecting 
both acute vaso-occlusive crisis and the ensuing chronic tissue damage 
in SCD. 

Finally, we evaluated whether the numbers of circulating aged neu- 
trophils were altered in patients with SCD. As penicillin V antibiotic 
prophylaxis therapy is recommended for children less than 5 years old 
or older patients with immune defects to prevent life-threatening
infections’, we determined aged neutrophil numbers in this patient 
population (Extended Data Fig. 8f, g). While there was no significant 
difference in total neutrophil numbers between SCD patients and 
healthy subjects, patients in the penicillin V prophylaxis group had 
significantly lower total neutrophil numbers (Fig. 4h). Consistent with 
our results in the mouse model, we found that SCD patients exhibited a 
marked increase in the numbers of circulating aged neutrophils com- 
pared to healthy controls. Remarkably, we observed significant reduc- 
tions in both percentages and numbers of aged neutrophils in patients 
on penicillin V prophylaxis compared with SCD patients not taking 
antibiotics (Fig. 4h and Extended Data Fig. 8f). Age differences, gender 
or hydroxyurea intake in this case-control study did not mitigate the 
effect of antibiotic treatment on aged neutrophil numbers (Extended 
Data Fig. 8h), although a prospective study with age-matched subjects 
will be needed to ascertain the independent value of antibiotics in 
controlling aged neutrophil numbers in SCD. 


Neutrophils are among the shortest-lived cells in the body’.
However, the evolutionary forces behind their rapid turnover remain
unclear. Our results suggest that signals from the microbiota, through 
TLRs and Myd88, gradually lead them to become more functionally 
active. These data thus emphasize the notion that immunity is main- 
tained by a balanced activation resulting from encounters with the 
microbiota, and provide a possible explanation for the evolutionary 
pressure that maintains an energy-consuming short lifespan as a 
mechanism to fine-tune the proportion of highly active neutrophils 
while balancing the risk of tissue injury. To our knowledge, this is the 
first therapy shown to alleviate the chronic tissue damage induced by 
SCD. Although antibiotic therapy normalized the overly active aged 
neutrophil population in SCD patients, the extent to which this treat- 
ment affects the gut microbiota or vaso-occlusive disease is open to 
future investigations. Our results raise the possibility that manipula- 
tion of the microbiome may have sustained implications in disease 
outcome that should be further studied in clinical trials. 


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS

Mice. Selp−/−, CD169-DTR, Csf2−/− mice, Tg[Hu-miniLCRα1GγAγδβS] Hba−/−
Hbb−/− (Berkeley sickle-cell mice) and Tg[Hu-miniLCRα1GγAγδβS] Hba−/−
Hbb+/− (hemizygous control mice) have been described previously!”.
B6.129P2-Lyz2tm1(cre)Ifo/J (LysM-cre), B6.129P2(SJL)-Myd88tm1Defr/J (Myd88fl/fl)
and B6;129S-Tnftm1Gkl/J (Tnf−/−) mice were purchased from The Jackson
Laboratory. Tlr2−/− and Tlr4−/− mice were kindly provided by E. G. Pamer
(Memorial Sloan-Kettering Cancer Center, NY). C57BL/6 CD45.1 and CD45.2 
mice were purchased from the National Cancer Institute. Six- to eight-week-old 
mice were used for experiments. All mice were housed in specific-pathogen-free 
conditions and fed with autoclaved food; experimental procedures performed on 
mice were approved by the Animal Care and Use Committee of Albert Einstein 
College of Medicine. Germ-free C57BL/6 mice were maintained in sterile isolators
with autoclaved food and water in the Gnotobiotic Core of Icahn School of Medicine 
at Mount Sinai. For fecal transplantation experiments, 100 mg of fecal pellets was
resuspended in 1 ml of PBS, homogenized, and filtered through a 70-μm strainer.
Recipient germ-free mice were gavaged with 200 μl of the filtrate.

Human samples. Blood was obtained from healthy volunteers, SCD patients and 
SCD patients on penicillin V prophylaxis after parental consent and child assent as 
approved by the Institutional Review Board of Albert Einstein College of 
Medicine. SCD patients were recruited upon routine visits at the sickle-cell clinic 
of Montefiore Medical Center. Among the 34 patients recruited for the study, 11 
were on penicillin V owing to age (less than 5 years old) or defective immunity, and 
23 were off antibiotic treatment for at least 2 months. Patients with acute infection 
or vaso-occlusive crisis were excluded from the study. 

Bone marrow transplantation. Age- and gender-matched sickle-cell disease (SS) 
and control hemizygous (SA) mouse cohorts were generated by transplanting 
bone marrow nucleated cells from Berkeley sickle-cell mice or control hemizygous 
mice into lethally irradiated C57BL/6 mice as described before'"*. Fully
reconstituted mice (> 97%) were used for studies. Chimaeric mice used to study
neutrophil TLRs were generated by transplanting a 1:1 mixture of bone marrow
nucleated cells from C57BL/6 mice (CD45.1+) and LysM-cre/Myd88fl/fl or
Tlr4−/− or Tlr2−/− mice (CD45.2+) into lethally irradiated C57BL/6 recipients
(CD45.1+). Chimaeric mice were analysed 6 weeks after transplantation.

Antibiotic treatment. Wild-type or SCD mice were treated with ampicillin
(1 g l−1), neomycin (1 g l−1), metronidazole (1 g l−1) and vancomycin (1 g l−1) in
drinking water for 4–6 weeks. Antibiotics were purchased from Sigma or Jack D.
Weiler Hospital of the Albert Einstein College of Medicine. Drinking water
containing antibiotics was changed every 3–4 days. For microbial product
reintroduction experiments, ABX-treated wild-type mice were fed with 1 mg LPS (0111:B4,
Sigma), 1 mg peptidoglycan (PGN-SA, Invivogen) or 1 mg
MurNAc-L-Ala-γ-D-Glu-mDAP (M-TriDAP, Invivogen) by intragastric gavage, and allowed to rest for
24–36 h. For the analysis of neutrophil-LPS interaction, untreated wild-type mice
were fed with 300 μg LPS-FITC (Sigma) by intragastric gavage, and tissues were
harvested 1 h after gavage.

Adoptive transfer. For in vivo neutrophil ageing analysis, whole blood from 
donor mice was transfused into recipient mice by retro-orbital injection. Donor 
neutrophils in blood were tracked based on CD45.1 and CD45.2 expression by 
flow cytometry. For microarray analyses, control and aged neutrophils were 
derived by in vivo ageing for 10 min and 6 h, respectively. Activated neutrophils
were harvested from mice injected with 0.5 μg TNFα (R&D Systems) for 2 h. To
analyse Mac-1 activation of neutrophils from wild-type and Selp−/− mice,
3–5 × 10⁶ leukocytes were harvested from blood following red blood cell (RBC)
lysis, and were labelled with red fluorescent dye PKH26 (Sigma) according to the 
manufacturer’s protocol before the transfer into recipient mice. 

Macrophage depletion. For the depletion of CD169+ macrophages, wild-type or
CD169-DTR mice were injected intraperitoneally with two doses of 10 μg kg−1
body weight diphtheria toxin (Sigma) 3 days apart. Mice were analysed 5 days after
the second injection. For depletion of macrophages in SCD mice, mice were
injected intravenously with 250 μl PBS- or clodronate-encapsulated liposomes
(the Foundation Clodronate Liposomes) as described before’’. 

Flow cytometry and cell sorting. Cells were surface-stained in PEB buffer
(PBS supplemented with 0.5% BSA and 2 mM EDTA) for 20–30 min on ice.
Multiparametric flow cytometric analyses were performed on a LSRII equipped
with FACS Diva 6.1 software (BD Biosciences) and analysed with FlowJo software
(Tree Star). Dead cells were excluded by FSC, SSC and 4′,6-diamidino-2-phenylindole
(DAPI, Sigma) staining. Cell sorting experiments were performed on Aria
Cell Sorter (BD Biosciences). Neutrophils were gated by Gr-1(hi) CD115(lo) SSC(hi);
T cells, B cells and monocytes were gated by CD3+, CD4+ and CD115(hi); aged
neutrophils were gated by CD62L(lo) CXCR4(hi) within the neutrophil population;
haematopoietic progenitor and stem cells were identified by lineage cocktail,
Sca-1, KitL, CD150, CD48, CD34 and CD16/32, as previously described*'; macrophages
were identified by Gr-1(lo) CD115(lo) F4/80+ SSC(lo) as previously described”.
Fluorophore-conjugated or biotinylated antibodies against mouse Gr-1
(RB6-8C5), CD115 (AFS98), CD3 (145-2C11), B220 (RA3-6B2), PE-anti- 
CXCR4 (2B11), CD45.1 (A20), CD45.2 (104), CD11b (M1/70), ICAM-1 (YN1/ 
1.7.4), CD11c (N418), CD49d (R1-2), CD45 (30-F11), Sca-1 (D7), c-Kit (2B8), 
CD34 (RAM34), and F4/80 (BM8) were from eBioscience. Antibodies specific to 
CD62L (MEL-14) and Biotin Mouse Lineage Panel (TER-119, RB6-8C5, RA3- 
6B2, M1/70, 145-2C11) were from BD Pharmingen. Antibodies against CD47 
(miap301), CD150 (TC15-12F12.2), CD48 (HM48-1) and CD16/32 (93) were 
from BioLegend. 

Blood leukocyte counts. Blood was harvested from the retro-orbital plexus, col- 
lected in EDTA or heparin, and analysed on an ADVIA 120 hematology system 
(SIEMENS). 

Brightfield intravital microscopy. Experimental procedures and data analyses 
were performed as previously described''’*"’. In brief, male mice were injected 
intrascrotally with 0.5 µg TNFα (R&D Systems), and were anaesthetized 2 h later 
by intraperitoneal injection of a mixture of 2% chloralose (Sigma) and 10% ureth- 
ane (Sigma) in PBS. Tracheal intubation was performed to ensure normal respira- 
tion after anaesthesia. The cremaster muscle was gently exteriorized, and mounted 
onto a microscopic stage, and then superfused with Ringer solution (pH 7.4, 
37°C). Under the microscope, leukocyte rolling, adhesion, transmigration and 
neutrophil-RBC interactions in post-capillary venules (20-40 µm in diameter) 
were captured using a custom-designed upright microscope (MM-40, Nikon) 
equipped with a 60× water-immersion objective (Nikon). Adhesion was quantified 
as the number of leukocytes remaining stationary for more than 20 s within a 
100 µm venular segment. In this model, more than 90% of adherent leukocytes 
are neutrophils based on previous studies'*. Neutrophil-RBC interactions were 
defined as the associations between an RBC and an adherent leukocyte for more 
than 3 s, and quantified as the number of interactions within a 100 µm vessel 
segment per minute. At least eight vessels from each animal were recorded and 
analysed using a charge-coupled video camera (Hamamatsu) and video recorder 
(Sony SVHS, SVO-9500). Venular diameters were measured using a video caliper, 
and centerline red cell velocity ($V_{\mathrm{RBC}}$) for each venule recorded was measured 
using an optical Doppler velocimeter (Texas A&M). Blood flow rate ($Q$) was 
calculated as $Q = V_{\mathrm{mean}} \times \pi d^{2}/4$, where $d$ is the venule diameter and $V_{\mathrm{mean}}$ is estimated as $V_{\mathrm{RBC}}/1.6$. 
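The flow calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the reading Q = V_mean × πd²/4 with V_mean = V_RBC/1.6. The function names and unit conversions are hypothetical, and the wall shear rate expression (2.12 × 8 × V_mean/d) is a commonly used estimate included here as an assumption; it is not stated in the text above.

```python
import math

# Ratio between centreline RBC velocity and mean blood velocity (from the text).
VELOCITY_RATIO = 1.6
# Assumption: empirical correction factor for the wall shear rate estimate.
# It is not given in the text above.
SHEAR_CORRECTION = 2.12


def mean_velocity_um_s(v_rbc_mm_s: float) -> float:
    """Mean blood velocity in µm/s from centreline RBC velocity in mm/s."""
    return v_rbc_mm_s / VELOCITY_RATIO * 1000.0


def blood_flow_pl_s(v_rbc_mm_s: float, diameter_um: float) -> float:
    """Blood flow rate Q = V_mean * pi * d^2 / 4, returned in pl/s."""
    area_um2 = math.pi * diameter_um ** 2 / 4.0
    # 1 pl equals 1000 µm^3.
    return mean_velocity_um_s(v_rbc_mm_s) * area_um2 / 1000.0


def wall_shear_rate_per_s(v_rbc_mm_s: float, diameter_um: float) -> float:
    """Assumed estimate: 2.12 * 8 * V_mean / d, in s^-1."""
    return SHEAR_CORRECTION * 8.0 * mean_velocity_um_s(v_rbc_mm_s) / diameter_um


if __name__ == "__main__":
    # Illustrative inputs: a 31.8 µm venule with V_RBC = 1.46 mm/s.
    print(round(blood_flow_pl_s(1.46, 31.8), 1))        # about 724.7 pl/s
    print(round(wall_shear_rate_per_s(1.46, 31.8), 1))  # about 486.7 s^-1
```

Working in µm and µm/s keeps the conversion to picolitres a single division by 1000.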

Multi-channel fluorescence intravital microscopy. MFIM analyses of Mac-1 
activation of adherent neutrophils were performed as previously described". In 
brief, yellow-green fluorescent microsphere beads (1.0 µm, Life Technologies) 
were incubated with 1 mg ml−1 BSA (Fisher Bioreagents) for at least 2 h in PBS 
and sonicated for 15 min in a water-bath sonicator (Laboratory Supplies Co.). 
Albumin-coated beads (10° beads per mouse) were injected into mice prepared 
for intravital microscopy 3 h after TNFα injection. For measurement of CD62L 
expression on adherent neutrophils, mice were intravenously injected with 4 µg 
APC-anti-CD62L (clone MEL-14, BD Pharmingen). For in vivo staining of CD45 
alleles, mice were intravenously injected with 5-10 µg Alexa Fluor 647-anti- 
CD45.2 (clone 104, Biolegend) and biotin-anti-CD45.1 (clone A20, 
eBioscience), and 20 min later injected with 5-10 µg streptavidin eFluor 450 
(eBioscience). Images and videos were captured using an Axio Examiner.D1 
microscope (Zeiss) equipped with a Yokogawa CSU-X1 confocal scan head with 
four stack laser system (405 nm, 488 nm, 561 nm, and 642 nm wavelengths) and a 
60× water-immersion objective, and analysed using Slidebook software 
(Intelligent Imaging Innovations). Mac-1 activation of adherent neutrophils was 
quantified as the average number of beads captured by each adherent neutrophil in 
post-capillary venules, and the percentage of cells that captured more than eight 
beads. 
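The two Mac-1 activation readouts described above (average beads per adherent neutrophil, and the percentage of cells capturing more than eight beads) can be computed as follows. This is only a sketch: the function name and the bead counts are hypothetical and are not data from the study.

```python
from statistics import mean

# Cells capturing more than this many beads are counted (threshold from the text).
BEAD_THRESHOLD = 8


def mac1_activation(beads_per_cell):
    """Return (average beads per cell, percentage of cells with > 8 beads)."""
    if not beads_per_cell:
        raise ValueError("at least one adherent neutrophil is required")
    average = mean(beads_per_cell)
    n_high = sum(1 for beads in beads_per_cell if beads > BEAD_THRESHOLD)
    return average, 100.0 * n_high / len(beads_per_cell)


if __name__ == "__main__":
    # Hypothetical bead counts for eight adherent neutrophils in one venule.
    counts = [0, 2, 9, 12, 8, 1, 15, 3]
    print(mac1_activation(counts))  # (6.25, 37.5)
```

Note that a cell with exactly eight beads is not counted, because the criterion is "more than eight".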

Intravital microscopic studies of SCD mice. Male mice were injected intraperitoneally 
with 0.5 µg TNFα (R&D Systems), and 2 h later neutrophil responses 
were analysed as described above. Survival times, defined as the time from TNFα 
injection until death, were recorded. 

Microarray analysis. Total RNA from 2,000 sorted neutrophils was extracted 
using RNeasy Plus Micro Kit (Qiagen) according to the manufacturer’s protocol. 
All further steps were performed at the Genomics Core Facility at Albert Einstein 
College of Medicine. RNA was amplified using Ovation One-Direct System 
(NuGEN). Amplified cRNA was labelled with the GeneChip wild-type terminal 
labelling kit (Affymetrix), hybridized to Mouse Gene ST 1.0 microarrays 
(Affymetrix), and scanned by GeneChip Scanner 3000 7G system (Affymetrix) 
according to standard protocols. Raw data were normalized by RMA algorithm 
and analysed using the Gene Pattern analysis platform (Broad Institute). After 
removal of unannotated genes, gene set enrichment analysis was performed with 
all C2 gene sets from the Molecular Signatures Database (v5.0, Broad Institute). 
Gene sets with a P value of < 0.05 in either aged or activated groups were considered 
to differ significantly from the control group. Normalized 
enrichment scores (NES) for selected pathways related to neutrophil functions were 
depicted as a heat map, with gene sets clustered by functional classifications. 
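The selection rule above (keep a gene set when P < 0.05 in either the aged or the activated group) is easy to express in code. The sketch below is illustrative only: the record type, the function name and the example values are hypothetical, and it does not reproduce the GenePattern workflow itself.

```python
from typing import List, NamedTuple

P_CUTOFF = 0.05  # significance threshold stated in the text


class GeneSetResult(NamedTuple):
    name: str
    aged_nes: float
    aged_p: float
    activated_nes: float
    activated_p: float


def significant_in_either(results: List[GeneSetResult],
                          cutoff: float = P_CUTOFF) -> List[GeneSetResult]:
    """Keep gene sets with P < cutoff in the aged or the activated group."""
    return [r for r in results if r.aged_p < cutoff or r.activated_p < cutoff]


if __name__ == "__main__":
    # Hypothetical GSEA output rows, not values from the study.
    rows = [
        GeneSetResult("SET_A", 1.1, 0.34, 1.9, 0.002),  # significant (activated)
        GeneSetResult("SET_B", 1.6, 0.01, 0.9, 0.60),   # significant (aged)
        GeneSetResult("SET_C", 0.8, 0.70, 1.0, 0.45),   # not significant
    ]
    print([r.name for r in significant_in_either(rows)])  # ['SET_A', 'SET_B']
```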

Quantitative real-time PCR (qPCR). Messenger RNA extraction from 500- 
2,000 sorted neutrophils using Dynabeads mRNA DIRECT Kit (Life 
Technologies) and reverse transcription using RNA to cDNA EcoDry Premix 
(Clontech) were performed according to the manufacturer’s protocols. qPCR 
was performed with SYBR GREEN (Roche) on ABI PRISM 7900HT Sequence 
Detection System (Life Technologies). The PCR protocol started with one cycle at 
95 °C (10 min) and continued with 40 cycles at 95 °C (15 s) and 60 °C (1 min). 
Expression of glyceraldehyde-3-phosphate dehydrogenase (Gapdh) was used as a 
standard. The average threshold cycle number ($C_t$) for each tested gene was used 
to quantify the relative expression of each gene: $2^{(C_t(\mathrm{standard}) - C_t(\mathrm{gene}))}$. Primers include: 
Gapdh forward, TGTGTCCGTCGTGGATCTGA; Gapdh reverse, 
CCTGCTTCACCACCTTCTTGA; Tlr4 forward, ATGGCATGGCTTACACCACCG; Tlr4 
reverse, GAGGCCAATTTTGTCTCCACA; Icam1 forward, 
GGACCACGGAGCCAATTTC; Icam1 reverse, CTCGGAGACATTAGAGAACAATGC; 
Itgam forward, CTGAACATCCCATGACCTTCC; Itgam reverse, 
GCCCAAGGACATATTCACAGC; Cxcl2 forward, CGCTGTCAATGCCTGAAG; Cxcl2 
reverse, GGCGTCACACTCAAGCTCT. 
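As a worked illustration of the relative expression calculation, the sketch below averages replicate threshold cycles and evaluates 2 raised to (Ct of the standard minus Ct of the gene), assuming that reading of the formula. The function name and the replicate values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_standard, ct_gene):
    """Relative expression 2^(Ct(standard) - Ct(gene)).

    Each argument is a list of replicate threshold cycles; replicates are
    averaged first, matching "average threshold cycle number" in the text.
    """
    return 2.0 ** (mean(ct_standard) - mean(ct_gene))


if __name__ == "__main__":
    # Hypothetical replicate Ct values for the standard (Gapdh) and a test gene.
    gapdh_ct = [18.0, 18.0, 18.0]
    gene_ct = [20.0, 20.0, 20.0]
    print(relative_expression(gapdh_ct, gene_ct))  # 0.25
```

A gene that crosses the threshold two cycles later than the standard is therefore scored at one quarter of the standard's expression.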

ELISA. Concentrations of IFN-γ, IL-6, TNFα and IL-1 were measured in plasma 
from wild-type and CD169-DTR mice 5 days after diphtheria toxin treatment 
using ELISA kits (eBioscience) according to the manufacturer’s instructions. 
For measurement of NET biomarkers, plasma nucleosome was measured using 
Cell Death Detection ELISA Plus kit (Roche), and plasma DNA was measured 
using Sytox Green (Life Technologies) as previously described”. 

16S rDNA quantification. Stool pellets were collected and total DNA was 
extracted using the QIAamp Fast DNA Stool Mini Kit (Qiagen). Quantification 
of 16S rDNA was performed by real-time qPCR using TaqMan Universal Master 
Mix (Life Technologies) and the following primers and probe as described™: 
forward, 5′-ACTGAGAYACGGYCCA-3′; reverse, 5′-TTACCGCGGCTGCTGGC-3′; 
probe, 6-FAM-ACTCCTACGGGAGGCAGCAGT-BHQ1. 

Taxonomic microbiota analysis. Taxonomic microbiota analysis was performed 
by the Molecular Biology and Next Generation Technology Core at Albert Einstein 
College of Medicine. In brief, purified 16S rDNA was used for PCR amplification 
and sequencing. The variable region 4-6 (V4-V6) of the 16S rDNA gene was 
amplified using barcoded 16V6R1052 and 16SV4F515 primers. Sequencing was 
performed using paired-end Illumina MiSeq sequencing. Taxonomical classifica- 
tion was obtained using the RDP-classifier to generate an OTU table, and the 
percentage of each family genus in total microbiome was derived from the OTU 
values. 
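Deriving genus percentages from an OTU table amounts to summing counts per genus and dividing by the total. The following sketch assumes a simple list of (genus, count) pairs; the function name and the counts are hypothetical and do not reflect the RDP-classifier output format.

```python
from collections import defaultdict


def genus_percentages(otu_counts):
    """Percentage of each genus in the total microbiome.

    otu_counts: iterable of (genus, read_count) pairs; several OTUs may
    map to the same genus and are summed.
    """
    totals = defaultdict(int)
    for genus, count in otu_counts:
        totals[genus] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("OTU table contains no reads")
    return {genus: 100.0 * count / grand_total for genus, count in totals.items()}


if __name__ == "__main__":
    # Hypothetical OTU counts for one sample.
    table = [("Porphyromonas", 900), ("Blautia", 50),
             ("Porphyromonas", 30), ("Bacteroides", 20)]
    print(genus_percentages(table))
    # {'Porphyromonas': 93.0, 'Blautia': 5.0, 'Bacteroides': 2.0}
```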

Neutrophil release-clearance kinetics. Mice were injected intraperitoneally with 
2 mg EdU and were bled on day 2-7 after EdU injection. Each mouse was bled only 
once to avoid the potential change in kinetics caused by bleeding. Cells were 
surface stained, fixed with 2% paraformaldehyde (PFA), and permeabilized with 
0.1% Triton-X. After permeabilization, EdU incorporation was detected by Click- 
iT EdU Alexa Fluor 647 Imaging Kit (Life Technologies) according to the man- 
ufacturer’s instructions. 

In vitro NET assay. Circulating neutrophils were harvested using Percoll Density 
Centrifugation Media (GE Healthcare) as previously described”’. In brief, blood 
was loaded onto a Percoll gradient consisting of 52%, 65% and 78% Percoll layers, 
and centrifuged at 2,500 r.p.m. for 30 min at room temperature. The cell bands 
between 65% and 78% layers were harvested, and RBCs were removed using RBC 
lysis buffer. Purity of 80-95% was consistently achieved with this method, as analysed 
by flow cytometry. For ROS production assay, neutrophils were treated with 
20 µg ml−1 LPS (0111:B4, Sigma) for 30 min. ROS detection was performed using 
fluorogenic dye Dihydrorhodamine 123 (Life Technologies) as previously 
described**. For NET formation in vitro, neutrophils were attached to poly-L- 
lysine-coated slides and treated with 20 µg ml−1 LPS (0111:B4, Sigma) for 3 h. 
Following stimulation, cells were stained without fixation with SYTOX Orange 
(cell impermeable) and SYTO 13 (cell permeable) nucleic acid dyes (Life 
Technologies) to image DNA fibres and distinguish live and dead cells. After 
DNA staining, cells were fixed, permeabilized and blocked as previously 
described”. Cells were incubated with goat anti-neutrophil elastase (sc-9521, 
Santa Cruz Biotechnology) and rabbit anti-CitH3 (ab5103, Abcam) followed by 
Alexa Fluor 647 Donkey-anti-goat (Life Technologies) and Brilliant Violet 421 
donkey anti-rabbit (Biolegend) secondary antibodies. Species-matching isotype 
controls were used to confirm fluorescence staining. NETs were defined by DNA 
fibres co-localized with neutrophil elastase and CitH3 proteins with a length 
exceeding 40 µm, and quantified as the percentage of NETs among all neutrophils 
present in the field. 

In vivo NET assay. For analyses of NET formation in vivo, goat anti-neutrophil- 
elastase (sc-9521, Santa Cruz Biotechnology), and rabbit anti-CitH3 (ab5103, 
Abcam) were labelled by APEX Alexa Fluor 647 and APEX Alexa Fluor 568 
antibody labelling kit (Life Technologies), respectively, according to the manufacturer’s 
instructions. Mice were injected intraperitoneally with 30 mg kg−1 LPS for 
3 h, and then were injected intravenously with 5 µg Alexa Fluor 568-labelled anti- 
CitH3, 2 µg Alexa Fluor 647-labelled anti-NE, 10 µg Pacific Blue-labelled anti- 
CD31 (clone 390 or clone MEC13.3, Biolegend) and 10 µM SYTOX Green (cell 
impermeable) nucleic acid dye (Life Technologies). Species-matching isotype con- 
trols were also labelled using the same protocol and injected into septic mice to 
validate fluorescence staining. Fresh livers were obtained 20 min after antibody 
injections, and directly imaged using an Axio Examiner.D1 confocal microscope 
(Zeiss). Confocal microscopy provided a penetration of ~100 µm into the tissue. 
Liver vasculature was identified by CD31 staining. NETs were defined by extra- 
cellular DNA fibres stained by SYTOX Green, anti-CitH3 and anti-neutrophil- 
elastase antibodies with a length exceeding 40 µm, and quantified as the average 
number of NETs in each vessel. 

Immunofluorescence. Tissues were embedded in Tissue-Plus O.C.T. Compound 
(Fisher HealthCare), and frozen 20-µm-thick sections were prepared using a 
CM3050 S cryostat (Leica). Sections were fixed with 4% PFA for 10 min, and 
blocked and permeabilized with PBS containing 20% species-matching serum 
and 0.5% Triton-X for 1-2 h. Adhesion molecules on endothelial cells were stained 
by PE-anti-P-selectin (clone Psel.KO2.3, eBioscience), PE-anti-E-selectin (clone 
10E9.6, BD Pharmingen), PE-anti-Icam1 (clone YN1/1.7.4, eBioscience), and PE- 
anti-VCAM-1 (clone 429, Biolegend). Expression of adhesion molecules was 
quantified using Slidebook software (Intelligent Imaging Innovations) as prev- 
iously described’. For immunofluorescence staining of neutrophils on sections, 
goat anti-neutrophil-elastase (sc-9521, Santa Cruz Biotechnology) and rabbit anti- 
CitH3 (ab5103, Abcam) were used, followed by Alexa Fluor 568 Donkey-anti-goat 
and Alexa Fluor 488 donkey anti-rabbit (Life Technologies) secondary antibodies. 
For analyses of fibrin deposition, sections were fixed using formalin containing 2% 
acetic acid for 30 min to remove soluble fibrinogen and leave only cross-linked 
fibrin in the tissue**. Sections were then blocked and permeabilized, and incubated 
with goat anti-fibrinogen-B (sc-18029, Santa Cruz Biotechnology) and PE-anti- 
Ly6G (clone 1A8, Biolegend), followed by Alexa Fluor 568 donkey-anti-goat sec- 
ondary antibody (Life Technologies). In all immunofluorescence experiments, 
endothelial cells were identified by Alexa Fluor 647-anti-CD31 (MEC13.3, 
Biolegend) and nuclei were stained by Hoechst 33342 (Life Technologies). 

Histology analyses of septic mice. Mice were injected intraperitoneally with 
30 mg kg−1 LPS, and livers were collected 24 h after injection and fixed in 10% 
formalin. Histological analyses were performed by the Histology and Comparative 
Pathology Facility at Albert Einstein College of Medicine according to standard 
protocols. Survival time was defined as the time from LPS injection until death of 
the mouse to a maximum of 150h. 

Statistical analysis. No statistical methods were used to predetermine sample size. 
Experiments were performed blind to group allocations, and validated by two 
investigators independently. Paired and unpaired two-tailed Student’s t-tests 
and Mann-Whitney U-tests were used to compare two groups. One-way 
ANOVA analysis was used for multiple group comparisons. Log-rank test was 
used to compare survival curves. Statistical analyses were performed using GraphPad 
Prism 6 software. 
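For readers who want to see what the summary statistics involve, the sketch below computes mean ± s.e.m. and the unpaired Student's t statistic with pooled variance, using only the Python standard library. It illustrates the arithmetic and is not the authors' GraphPad Prism workflow. Function names and sample values are hypothetical, and converting t to a P value would additionally require the t distribution, which is not implemented here.

```python
import math
from statistics import mean, stdev, variance


def mean_sem(values):
    """Return (mean, standard error of the mean) for a sample."""
    return mean(values), stdev(values) / math.sqrt(len(values))


def unpaired_t(a, b):
    """Unpaired Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return t, df


if __name__ == "__main__":
    # Hypothetical measurements for two groups.
    control = [1.0, 2.0, 3.0]
    treated = [2.0, 3.0, 4.0]
    print(mean_sem(control))             # mean 2.0, s.e.m. about 0.577
    print(unpaired_t(control, treated))  # t about -1.225, df = 4
```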

Extended Data Figure 1 | Phenotypic and functional characterization of 
aged neutrophils. a, Flow cytometry analysis of donor neutrophil ageing after 
adoptive transfer into recipients. Donor neutrophils gated by CD45.1+ and 
aged neutrophils gated by CD62Llo CXCR4hi. b, Ageing and clearance kinetics 
of donor neutrophils after adoptive transfer into recipients (n = 3 mice). Left 
y axis, donor neutrophil number relative to the initial number of neutrophils 
transferred (black dashed line); right y axis, percentage of the aged subset in 
donor neutrophils (red line). c, d, MFIM analysis of Mac-1 activation of 
neutrophils harvested from wild-type or Selp−/− mice, labelled by PKH26 (red) 
and transferred into wild-type recipients. Scale bar, 10 µm. e, Plasma cytokine 
levels in wild-type and CD169-DTR mice 5 days after diphtheria toxin 
treatment (n = 5 mice). f, Percentages of adherent neutrophils that capture 
more than eight beads in diphtheria-toxin-treated wild-type and CD169-DTR 
mice (n = 8 mice). g, h, Flow cytometry analysis of surface marker expression 
(g), cell size (FSC) and granularity (SSC; h; n = 7 mice) on CD62Lhi young and 
CD62Llo aged neutrophils. i, CXCR4 expression levels on CD62Lhi young 
and CD62Llo aged neutrophils in wild-type, Selp−/−, and CD169-DTR mice 
(wild type, n = 13 mice; Selp−/−, n = 4 mice; CD169-DTR, n = 5 mice). Error 
bars, mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, data representing 
two or more independent experiments analysed with one-way ANOVA (b) or 
unpaired Student’s t-test (e-i). 

Extended Data Figure 2 | Antibiotic treatment efficiently depletes and 
alters the composition of the microbiota. a, Copy numbers of 16S ribosomal 
DNA in feces from control and antibiotics (ABX)-treated mice (n = 5 mice). 
b, Principal component analysis of the microbiome composition in control and 
ABX-treated mice (n = 5 mice). c, d, Percentage of each bacteria genus in 
total microbiome (n = 5 mice). Error bars, mean ± s.e.m. *P < 0.05, 
***P < 0.001, data representing two or more independent experiments 
analysed with unpaired Student’s t-test (a, d) or permutational multivariate 
ANOVA (b). 

Taxon (phylum/class-order/family-genus) | Ctrl (%) | ABX (%) | P value 
Fusobacteria/Fusobacteriia-Fusobacteriales/Fusobacteriaceae-Fusobacterium | 0.007 ± 0.003 | 0.640 ± 0.392 | 0.182 
Firmicutes/Negativicutes-Selenomonadales/Acidaminococcaceae-Acidaminococcus | 0.002 ± 0.002 | 0.753 ± 0.555 | 0.247 
Firmicutes/Clostridia-Clostridiales/unknown-unknown | 0.098 ± 0.037 | 0.362 ± 0.222 | 0.302 
Firmicutes/Clostridia-Clostridiales/Ruminococcaceae-Ruminiclostridium | 0.046 ± 0.017 | 0.161 ± 0.161 | 0.515 
Firmicutes/Clostridia-Clostridiales/Ruminococcaceae-Oscillospira | 0.028 ± 0.011 | 0.000 ± 0.000 | 0.065 
Firmicutes/Clostridia-Clostridiales/Peptostreptococcaceae-Peptostreptococcus | 0.004 ± 0.002 | 0.180 ± 0.180 | 0.384 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-unknown | 0.854 ± 0.284 | 0.161 ± 0.161 | 0.076 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-Moryella | 0.004 ± 0.003 | 1.495 ± 0.636 | 0.079 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-Lachnoclostridium | 0.154 ± 0.027 | 0.143 ± 0.143 | 0.941 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-Dorea | 0.004 ± 0.003 | 0.302 ± 0.185 | 0.183 
* Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-Coprococcus | 0.041 ± 0.009 | 0.000 ± 0.000 | 0.012 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-Butyrivibrio | 0.400 ± 0.227 | 0.482 ± 0.199 | 0.793 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-Blautia | 1.413 ± 0.215 | 1.027 ± 0.448 | 0.468 
Firmicutes/Clostridia-Clostridiales/Lachnospiraceae-[Ruminococcus] | 0.004 ± 0.003 | 1.087 ± 0.475 | 0.085 
Firmicutes/Clostridia-Clostridiales/Eubacteriaceae-Eubacterium | 0.141 ± 0.019 | 5.736 ± 2.586 | 0.097 
Firmicutes/Clostridia-Clostridiales/Clostridiales-Finegoldia | 0.004 ± 0.002 | 1.026 ± 0.483 | 0.102 
Firmicutes/Clostridia-Clostridiales/[Tissierellaceae]-WAL_1855D | 0.033 ± 0.011 | 3.805 ± 1.452 | 0.06 
Firmicutes/Bacilli-Lactobacillales/Lactobacillaceae-Lactobacillus | 0.066 ± 0.041 | 15.478 ± 6.598 | 0.08 
* Deferribacteres/Deferribacteres-Deferribacterales/Deferribacteraceae-Mucispirillum | 0.063 ± 0.021 | 0.000 ± 0.000 | 0.039 
* Bacteroidetes/Bacteroidia-Bacteroidales/Rikenellaceae-Alistipes | 0.302 ± 0.081 | 0.000 ± 0.000 | 0.021 
* Bacteroidetes/Bacteroidia-Bacteroidales/Prevotellaceae-Prevotella | 0.192 ± 0.052 | 24.135 ± 3.406 | 0.002 
* Bacteroidetes/Bacteroidia-Bacteroidales/Porphyromonadaceae-Porphyromonas | 94.895 ± 0.648 | 20.191 ± 1.744 | 1.47E-07 
Bacteroidetes/Bacteroidia-Bacteroidales/Porphyromonadaceae-Odoribacter | 0.000 ± 0.000 | 0.323 ± 0.323 | 0.374 
* Bacteroidetes/Bacteroidia-Bacteroidales/Bacteroidaceae-Bacteroides | 0.763 ± 0.157 | 13.989 ± 0.624 | 1.27E-05 
Actinobacteria/Actinobacteria-Coriobacteriales/Coriobacteriaceae-Atopobium | 0.004 ± 0.003 | 0.722 ± 0.525 | 0.243 
Other | 0.024 ± 0.016 | 0.000 ± 0.000 | 0.211 
Data are presented as mean ± s.e.m. 

Extended Data Figure 3 | Microbiota-derived molecules regulate neutrophil 
homeostasis and ageing. a, Numbers of circulating leukocyte subsets in 
control and antibiotics (ABX)-treated mice (n = 9 mice). b, Bone marrow 
cellularity and numbers of leukocyte subsets in the bone marrow of control and 
ABX-treated mice (n = 14 mice). c, Numbers of bone marrow haematopoietic 
stem and progenitor cells in control and ABX-treated mice (n = 9 mice). 
d, Spleen cellularity and numbers of leukocyte subsets in the spleen of control 
and ABX-treated mice (n = 7 mice). e, Flow cytometry analysis of neutrophil- 
LPS interactions in blood, bone marrow (BM) and spleen 1 h after LPS-FITC 
gavage (Ctrl, n = 4 mice; LPS-FITC, n = 5 mice). Histogram showing 
fluorescence intensity on neutrophils gated by Gr-1hi CD115lo SSChi. 
f, Numbers of circulating aged neutrophils in control, ABX-treated, and 
ABX-treated mice fed with peptidoglycan (PGN) or mTriDAP (left, n = 11 
(Ctrl), 9 (ABX), 9 (ABX+PGN) mice; right, n = 10 (Ctrl), 10 (ABX), 5 
(ABX+mTriDAP) mice). Error bars, mean ± s.e.m. *P < 0.05, **P < 0.01, 
***P < 0.001, data representing two or more independent experiments 
analysed with unpaired Student’s t-test. 

Extended Data Figure 4 | Neutrophil homeostasis is altered in germ-free 
mice. a, Total white blood cell (WBC) counts and numbers of leukocyte subsets 
in blood of specific-pathogen-free (SPF) and germ-free (GF) mice (n = 5 mice). 
b, Total bone marrow (BM) cellularity and numbers of leukocyte subsets in 
the bone marrow of SPF and germ-free mice (SPF, n = 5 mice; germ-free, n = 4 
mice). c, Total spleen cellularity and numbers of leukocyte subsets in the 
spleen of SPF and germ-free mice (SPF, n = 5 mice; germ-free, n = 4 mice). 
d, Copy numbers of 16S ribosomal DNA in feces from SPF mice, germ-free 
mice, germ-free mice reconstituted by fecal transplantation (GF-FT), and 
antibiotic-treated germ-free mice (GF-ABX; n = 5, 5, 5 and 4 mice, 
respectively). e, Numbers of total circulating neutrophils in SPF, germ-free, GF- 
FT, and GF-ABX mice (n = 5, 5, 5 and 3 mice, respectively). Error bars, 
mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, data representing two 
or more independent experiments analysed with unpaired Student’s t-tests. 

Extended Data Figure 5 | Microbiota-driven neutrophil ageing is 
independent of clearance mechanisms, and mediated by TLRs and Myd88 
signalling. a, Adhesion molecule expression on endothelial cells (ECs) in 
control and antibiotics (ABX)-treated mice (n = 4 mice). MFI, mean 
fluorescence intensity. b, Numbers of spleen and liver macrophages in control 
and ABX-treated mice (left, n = 7 mice; right, n = 4 mice). c, d, Numbers of 
bone marrow (BM) macrophages (c; n = 19, 19, 10, 10 mice, left to right) and 
circulating aged neutrophils (d; n = 12, 11, 10, 9 mice, left to right) in 
diphtheria toxin (DT)-treated control, ABX-treated mice, CD169-DTR, and 
ABX-treated CD169-DTR mice. e, Flow cytometry analysis of aged neutrophils 
in wild-type and LysM-cre/Myd88fl/fl mice (n = 12, 10 mice, respectively). 
f, Percentages of aged neutrophils in wild-type, Tlr4−/− and Tlr2−/− mice 
(n = 10, 10, 12 mice, respectively). g, Flow cytometry analysis of aged 
neutrophils in wild-type and Tnf−/− or Csf2−/− mice. h, Percentages of wild- 
type and LysM-cre/Myd88fl/fl or Tlr4−/− or Tlr2−/− neutrophils in total 
leukocyte population in chimaeric mice (n = 5 mice). i, Percentages of wild- 
type and LysM-cre/Myd88fl/fl or Tlr4−/− or Tlr2−/− neutrophils that capture 
more than eight beads in chimaeric mice (n = 5 mice). Error bars, 
mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, data representing two or 
more independent experiments analysed with unpaired Student’s t-test (a-f) or 
paired Student’s t-test (h, i). 

Extended Data Figure 6 | Microbiota depletion inhibits NET formation. 
a, Flow cytometry analysis of aged neutrophils in isotype and anti-P/E-selectin 
antibody-treated mice (n = 6, 5 mice, respectively). b, ROS production of 
neutrophils from isotype and anti-P/E-selectin antibody-treated mice, as 
analysed by flow cytometry using dihydrorhodamine 123 (DHR-123; Isotype, 
n = 10; Abs (P/E), n = 11 mice). Grey lines, background fluorescence of 
neutrophils from both groups without LPS stimulation. ns, not significant. 
c, LPS-induced NET formation of neutrophils from control and antibiotics 
(ABX)-treated mice, as analysed by immunofluorescence staining of DNA 
(sytox orange), neutrophil elastase (NE) and citrullinated histone 3 (CitH3). 
Inset, isotype control. Scale bars, 10 µm. d, Quantification of NET formation of 
neutrophils from isotype and anti-P/E-selectin antibody-treated mice, or 
from control and ABX-treated mice (left, n = 4 (Isotype), 5 (Abs) mice; right, 
n = 4 mice). Error bars, mean ± s.e.m. *P < 0.05, **P < 0.01, ***P < 0.001, 
data representing two or more independent experiments analysed with 
unpaired Student’s t-test. 

Extended Data Figure 7 | Microbiota depletion benefits endotoxin-induced 
septic shock. a, Representative images and quantification of in vivo NET 
formation in liver vasculature of control and antibiotics (ABX)-treated mice 
challenged with 30 mg kg−1 LPS (n = 3, 4 mice, respectively). Scale bar, 10 µm. 
b, Quantification of NET biomarkers, plasma nucleosome and DNA, in septic 
control and ABX-treated animals (n = 4 mice). c, d, Representative images 
showing CitH3+ neutrophil aggregates (c) and fibrin deposition associated 
with neutrophil aggregates (d) in septic liver of control and ABX-treated mice. 
Arrows, diffusive CitH3 and neutrophil elastase (NE) proteins. Insets, 
isotype controls. Scale bars, 10 µm. e, f, Numbers of CitH3+ neutrophils and 
neutrophil aggregates (e; left: n = 4 mice; right: n = 40 vessels from 4 mice) and 
quantification of fibrin deposition (f; n = 4 (Ctrl), 3 (ABX) mice) in septic 
liver of control and ABX-treated mice. g, Survival time of control, ABX-treated 
mice, and ABX-treated mice infused with 2 × 10° aged or young neutrophils in 
septic shock induced by 30 mg kg−1 LPS (n = 16, 10, 13, 6 mice, respectively). 
Error bars, mean ± s.e.m. *P < 0.05, **P < 0.01, data representing two or 
more independent experiments analysed with unpaired Student’s t-test 
(a, e (left), f), Mann-Whitney U-test (b, e (right)) or log-rank test (g). 

Group | Mice (n) | Venules (n) | Venular diameter (µm) | Centerline velocity (mm/s) | Shear rate (s−1) | Blood flow (pL/s) 
Ctrl | 4 | 33 | 31.8 ± 0.7 | 1.46 ± 0.22 | 492 ± 83 | 730 ± 83 
ABX | 4 | 38 | 30.0 ± 0.6 | 1.92 ± 0.27 | 680 ± 92 | 881 ± 138 

Patient ID | Pen V | Gender | Splenectomy | Hydroxyurea | ACS (past year) | VOC (past year) | Other complications (past year) 
NM-035S | N | M | N | Y | 0 | 1 | Obstructive sleep apnea 
NM-024S | N | F | N | Y | 0 | 0 | Flu 
NM-030S | N | F | N | Y | 0 | 0 | 
NM-014S | N | M | N | Y | 0 | 0 | Nocturnal hypoxemia, Obstructive sleep apnea 
NM-023S | N | F | N | N | 0 | 0 | 
NM-027S | N | F | N | Y | 0 | 0 | 
NM-018S | N | M | N | N | 0 | 0 | 
NM-032S | N | F | N | Y | 1 | 0 | 
NM-008S | N | M | N | Y | 1 | 1 | 
NM-021S | N | F | N | Y | 0 | 2 | 
NM-001S | N | M | N | Y | 1 | 4 | 
NM-003S | N | M | N | Y | 1 | 3 | Nocturnal hypoxemia 
NM-015S | N | M | N | Y | 0 | 0 | 
NM-017S | N | F | N | Y | 0 | 0 | 
NM-002S | N | M | N | Y | N/A | N/A | CNS vasculopathy 
NM-009S | N | F | N | Y | 0 | 2 | 
NM-006S | N | F | N | N | 0 | 0 | 
NM-011S | N | M | N | Y | 1 | 1 | 
NM-016S | N | M | N | N | 0 | 3 | 
NM-031S | N | F | N | Y | 2 | 1 | 
NM-010S | N | M | N | Y | 0 | 1 | 
NM-004S | N | M | N | Y | 0 | 0 | 
NM-005S | N | M | N | Y | 1 | 1 | Pulmonary hypertension 
NM-013S | Y | M | N | Mm | 0 | 0 | 
NM-033S | Y | M | N | Y | 0 | 0 | 
NM-019S | Y | F | N | Y | 1 | 2 | 
NM-022S | Y | F | N | N | 0 | 0 | Splenic sequestration 
NM-034S | Y | F | N | N | 0 | 0 | 
NM-007S | Y | M | N | Y | 1 | 1 | Asthma, allergies 
NM-025S | Y | F | N | Y | 0 | 0 | 
NM-020S | Y | F | Y | Y | 1 | 1 | 
NM-026S | Y | M | N | N | 0 | 0 | 
NM-012S | Y | M | N | Y | 0 | 1 | 
NM-028S | Y | M | Y | Y | 0 | 0 | Delayed hemolytic transfusion reaction 

ACS, acute chest syndrome; VOC, vaso-occlusive crisis. 
None of the recruited patients had infection, VOC or other complications for at least 2 weeks before the study. 

Extended Data Figure 8 | Microbiota depletion affects disease progression 
in sickle-cell disease. a, Numbers of circulating leukocyte subsets in 
hemizygous control (SA), control SCD (SS Ctrl) and antibiotics-treated SCD 
(SS ABX) mice (SA: n = 8 mice; SS Ctrl: n = 9 mice; SS ABX: n = 9 mice). 
b, Haemodynamic parameters of mice analysed for neutrophil adhesion and 
integrin activation. c, Percentages of adherent neutrophils that capture more 
than eight beads in SA, SS Ctrl and SS ABX mice (n = 4, 3, 3 mice, respectively). 
d, Correlation between the survival times of SS control and SS ABX mice in 
acute vaso-occlusive crisis and their spleen weights. R² = 0.45. e, Scoring of liver 
damage, liver fibrosis, inflammation and necrosis in SS control and SS ABX 
mice (n = 8, 9 mice, respectively). f, Flow cytometry analysis of aged 
neutrophils in healthy controls, SCD patients (SS), and SCD patients on 
penicillin V prophylaxis (SS-PV). g, Demographics of human subjects analysed 
for aged neutrophil numbers. ACS, acute chest syndrome; VOC, vaso-occlusive 
crisis. h, Aged neutrophil numbers in SCD patients grouped by age, gender, 
hydroxyurea (HU) and penicillin V (Pen V) treatment (Ctrl, n = 9 subjects; 
SS, n = 23 subjects; SS-PV, n = 11 subjects). Error bars, mean ± s.e.m. 
*P < 0.05, **P < 0.01, ***P < 0.001, data representing two or more 
independent experiments analysed with unpaired Student's t-test (a, c, h) or 
Mann-Whitney U-test (e). 
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Extended Data Table 1 | Pathways selected for the analysis of neutrophil functions 


Pathwa Gene set name 

Innate immune system REACTOME_INNATE_IMMUNE_SYSTEM 

Local acute inflammatory response BIOCARTA_LAIR_PATHWAY 

Leukocyte adhesion and diapedesis BIOCARTA_LYM_PATHWAY 

MAPK signaling for integrins REACTOME_P130CAS_LINKAGE_TO_MAPK_SIGNALING_FOR_INTEGRINS 

Cell surface integrin pathway PID_INTEGRIN_CS_PATHWAY 

Fce-gamma receptor mediated phagocytosis KEGG_FC_GAMMA_R_MEDIATED_PHAGOCYTOSIS 

Lymphoid/Non-lymphoid interactions REACTOME_IMMUNOREGULATORY_INTERACTIONS_BETWEEN_A_LYMPHOID_AND_A_NON_LYMPHOID_CELL 
ROS pathway BIOCARTA_FREE_PATHWAY 

Peroxisome KEGG_PEROXISOME 

Cell/ECM interactions REACTOME_CELL_EXTRACELLULAR_MATRIX_INTERACTIONS 

Degradation of ECM REACTOME_DEGRADATION_OF_THE_EXTRACELLULAR_MATRIX 

Complement and coagulation cascades KEGG_COMPLEMENT_AND_COAGULATION_CASCADES 

TLR signaling pathway KEGG_TOLL_LIKE_RECEPTOR_SIGNALING_PATHWAY 

TLR endogenous pathway PID_TOLL_ENDOGENOUS_PATHWAY 

TLR4 signaling REACTOME_ACTIVATED_TLR4_SIGNALLING 

NLR signaling pathway KEGG_NOD_LIKE_RECEPTOR_SIGNALING_PATHWAY 

NLR pathway REACTOME_NUCLEOTIDE_BINDING_DOMAIN_LEUCINE_RICH_REPEAT_CONTAINING_RECEPTOR_NLR_SIGNALING_PATHWAYS 
NOD1/2 signaling pathway REACTOME_NOD1_2_SIGNALING_PATHWAY 

RLR pathway KEGG_RIG_I_LIKE_RECEPTOR_SIGNALING_PATHWAY 

RIG | mediated induction of IFN alpha REACTOME_RIG_I_MDAS_MEDIATED_INDUCTION_OF_IFN_ALPHA_BETA_PATHWAYS 

Activation of NFKB REACTOME_ACTIVATION_OF_NF_KAPPAB_IN_B_CELLS 

NFKB canonical pathway PID_NFKAPPABCANONICALPATHWAY 

NFKB atypical pathway PID_NFKAPPABATYPICALPATHWAY 

NFKB activation by RIP1 REACTOME_IKK_COMPLEX_RECRUITMENT_MEDIATED_BY_RIP1 

NFKB activation through FADD/RIP1 pathway REACTOME_NFKB_ACTIVATION_THROUGH_FADD_RIP1_PATHWAY_MEDIATED_BY_CASPASE_8_AND10 
NFKB activation by TAK1 REACTOME_TAK1_ACTIVATES_NFKB_BY_PHOSPHORYLATION_AND_ACTIVATION_OF_IKKS_COMPLEX 
NFKB and MAPK activation mediated by TLR4 REACTOME_NFKB_AND_MAP_KINASES_ACTIVATION_MEDIATED_BY_TLR4_SIGNALING_REPERTOIRE 
NFKB activation induced by TRAF6 REACTOME_TRAF6_MEDIATED_INDUCTION_OF_NFKB_AND_MAP_KINASES_UPON_TLR7_8_OR_9_ACTIVATION 
Chemokine signaling pathway KEGG_CHEMOKINE_SIGNALING_PATHWAY 

CXCR2 pathway PID_IL8CXCR2_PATHWAY 

Cytokine signaling pathway REACTOME_CYTOKINE_SIGNALING_IN_IMMUNE_SYSTEM 

Cytokines BIOCARTA_CYTOKINE_PATHWAY 

L1 pathway REACTOME_IL1_SIGNALING 

L2 receptor beta pathway BIOCARTA_IL2RB_PATHWAY 

L4 pathway ST_INTERLEUKIN_4_ PATHWAY 

L6 pathway EACTOME_IL_6_SIGNALING 

L12 pathway D_IL12_2PATHWAY 

L23 pathway D_IL23PATHWAY 

L27 pathway D_IL27PATHWAY 


IFN-alpha signaling REACTOME_REGULATION_OF_IFNA_SIGNALING 
IFN-gamma pathway REACTOME_INTERFERON_GAMMA_SIGNALING 
TNFR1 pathway BIOCARTA_TNFR1_PATHWAY 
TNF/p75/NTR signaling PID_P75NTRPATHWAY 
TNFR2 pathway BIOCARTA_TNFR2_PATHWAY 
PPAR signaling pathway KEGG_PPAR_SIGNALING_PATHWAY 
p38/MAPK pathway REACTOME_P38MAPK_EVENTS 
ERK pathway REACTOME_SIGNALLING_TO_ERKS 
PIP3 signaling pathway SIG_PIP3_SIGNALING_IN_CARDIAC_MYOCTES 
Ras signaling pathway REACTOME_SIGNALLING_TO_RAS 
RhoA pathway PID_RHOA_PATHWAY 
HDAC Class I pathway PID_HDAC_CLASSI_PATHWAY 
G-alpha I signaling pathway REACTOME_G_ALPHA_I_SIGNALLING_EVENTS 
NFAT pathway PID_NFAT_TFPATHWAY 
PDGF pathway BIOCARTA_PDGF_PATHWAY 
Rac1 pathway PID_RAC1_PATHWAY 




AKT pathway BIOCARTA_AKT_PATHWAY 

MAPK pathway BIOCARTA_MAPK_PATHWAY 

HIF pathway BIOCARTA_HIF_PATHWAY 

Purinergic receptor signaling pathway REACTOME_NUCLEOTIDE_LIKE_PURINERGIC_RECEPTORS 

Purinergic receptor P2Y signaling pathway REACTOME_P2Y_RECEPTORS 

Translation REACTOME_TRANSLATION 

Ribosome KEGG_RIBOSOME 

43S Ribosome REACTOME_ACTIVATION_OF_THE_MRNA_UPON_BINDING_OF_THE_CAP_BINDING_COMPLEX_AND_EIFS_AND_SUBSEQUENT_BINDING_TO_43S 

EIF pathway BIOCARTA_EIF_PATHWAY 

Protein export KEGG_PROTEIN_EXPORT 

Amino acid degradation REACTOME_METABOLISM_OF_AMINO_ACIDS_AND_DERIVATIVES 

Proteosome KEGG_PROTEASOME 

Ubiquitin mediated proteolysis KEGG_UBIQUITIN_MEDIATED_PROTEOLYSIS 

Protein degradation-Parkin pathway BIOCARTA_PARKIN_PATHWAY 

Cell death BIOCARTA_DEATH_PATHWAY 

Caspase cascades SA_CASPASE_CASCADE 

Cell death mitochondria pathway BIOCARTA_MITOCHONDRIA_PATHWAY 


Gene set name refers to Molecular Signatures Database v5.0 (Broad Institute). 

Extended Data Table 2 | Gene set enrichment analysis of selected pathways in aged and activated neutrophils 

| Category | Pathway | Aged NES | Aged p-val | Activated NES | Activated p-val |
|---|---|---|---|---|---|
| Immune Functions | Innate immune system | 1.2282217 | 0.10245901 | 2.1779234 | 10) |
| | Local acute inflammatory response | 1.1145728 | 0.3392857 | 1.8885438 | 0.001930502 |
| | Leukocyte adhesion and diapedesis | 1.5105772 | 0.06352941 | 1.6552799 | 0.025242718 |
| | MAPK signaling for integrins | 1.4485482 | 0.061440676 | 1.6242106 | 0.032128513 |
| | Cell surface integrin pathway | 1.4916337 | 0.0675 | -1.2022164 | 0.24557522 |
| | Fc-gamma receptor mediated phagocytosis | 0.8664476 | 0.6876877 | 1.61418 | 0.008116883 |
| | Lymphoid/Non-lymphoid interactions | 0.8654324 | 0.6196172 | 1.5619569 | 0.033557046 |
| | ROS pathway | -0.95446336 | 0.5321429 | 1.7730503 | 0.005565863 |
| | Peroxisome | -1.203138 | 0.20919882 | 1.6480244 | 0.003395586 |
| | Cell/ECM interactions | -0.7654254 | 0.7643979 | 1.772366 | 0.006048387 |
| | Degradation of ECM | -1.4378743 | 0.095914744 | 1.5911175 | 0.028462999 |
| | Complement and coagulation cascades | -1.5144768 | 0.070853464 | 1.6505035 | 0.030303031 |
| Immune Functions: PRR and NFKB Signaling | TLR signaling pathway | 1.4612477 | 0.026706232 | 2.3468528 | 0 |
| | TLR endogenous pathway | 1.3665496 | 0.1056338 | 1.5889539 | 0.035714287 |
| | TLR4 signaling | 1.0985918 | 0.31333333 | 1.9638056 | 0 |
| | NLR signaling pathway | 1.4583313 | 0.05479452 | 2.5441191 | 0 |
| | NLR pathway | 1.4290915 | 0.07730673 | 2.1706495 | 0 |
| | NOD1/2 signaling pathway | 1.3088835 | 0.13197969 | 2.2286413 | 0 |
| | RLR pathway | -0.8638851 | 0.6666667 | 2.1639493 | 0 |
| | RIG-I mediated induction of IFN alpha | -0.8472993 | 0.7259843 | 2.0400646 | 0 |
| | Activation of NFKB | 1.2855691 | 0.11396012 | 1.8813537 | 0.001718213 |
| | NFKB canonical pathway | 1.3068268 | 0.1495098 | 2.174795 | 0 |
| | NFKB atypical pathway | 0.8993623 | 0.6034483 | 1.9205661 | 0.001949318 |
| | NFKB activation by RIP1 | 1.7887179 | 0.002192983 | 1.8208189 | 0.001930502 |
| | NFKB activation through FADD/RIP1 pathway | 1.8454865 | 0.008928572 | 1.6110392 | 0.033898305 |
| | NFKB activation by TAK1 | 1.2061867 | 0.22624435 | 2.0932863 | 0 |
| | NFKB and MAPK activation mediated by TLR4 | 1.1569227 | 0.22841226 | 1.8594357 | 0 |
| | NFKB activation induced by TRAF6 | 1.0924268 | 0.27714285 | 1.6228119 | 0.010016695 |
| Immune Functions: Cytokine and Chemokine | Chemokine signaling pathway | -0.7447356 | 0.8922652 | 1.433992 | 0.028616853 |
| | CXCR2 pathway | -0.9767436 | 0.5085324 | 1.5124553 | 0.050583657 |
| | Cytokine signaling pathway | -1.1273036 | 0.25921053 | 2.1093962 | 0 |
| | Cytokines | -0.9237167 | 0.5686275 | 1.5562985 | 0.039325844 |
| | IL1 pathway | 1.6451061 | 0.02925532 | 2.078718 | 0 |
| | IL2 receptor beta pathway | -2.0031374 | 0 | 1.2259008 | 0.2228261 |
| | IL4 pathway | -1.312908 | 0.16470589 | 1.5741479 | 0.03169014 |
| | IL6 pathway | -0.8152665 | 0.7010676 | 1.5760885 | 0.046747968 |
| | IL12 pathway | -1.6290972 | 0.014240506 | 1.9707948 | 0.001675042 |
| | IL23 pathway | -1.2130085 | 0.24518389 | 1.9758543 | 0 |
| | IL27 pathway | -1.3348651 | 0.16555184 | 1.7636672 | 0.005464481 |
| | IFN-alpha signaling | -0.8072854 | 0.68761224 | 1.6680452 | 0.014705882 |
| | IFN-gamma pathway | 1.4466043 | 0.087804876 | 2.066193 | 0 |
| | TNFR1 pathway | 1.2722206 | 0.16010499 | 1.6132443 | 0.032432433 |
| | TNF/p75/NTR signaling | 1.3214921 | 0.08732394 | 1.726003 | 0.003231018 |
| | TNFR2 pathway | -0.8648928 | 0.6243386 | 1.9728887 | 0 |
| Signal Pathways | PPAR signaling pathway | -0.8915187 | 0.6086956 | 2.0968451 | 0 |
| | p38/MAPK pathway | -1.0720062 | 0.37918216 | 1.9020989 | 0 |
| | ERK pathway | -1.1935399 | 0.24440895 | 1.4667499 | 0.04886562 |
| | PIP3 signaling pathway | -1.4735942 | 0.05496183 | 1.5826265 | 0.022452503 |
| | Ras signaling pathway | -1.6251277 | 0.02680067 | 1.4745739 | 0.07090909 |
| | RhoA pathway | -1.6779591 | 0.017133957 | 1.1863663 | 0.24090122 |
| | HDAC Class I pathway | -1.7304282 | 0.00608828 | 1.3537472 | 0.09764919 |
| | G-alpha I signaling pathway | -0.8018528 | 0.78581977 | 1.4290625 | 0.03891709 |
| | NFAT pathway | -1.5555012 | 0.039087947 | -1.6272985 | 0.027272727 |
| | PDGF pathway | -1.6846628 | 0.01983471 | -1.7379107 | 0.014354067 |
| | Rac1 pathway | 0.64468014 | 0.91351354 | 1.5256119 | 0.033043478 |
| | AKT pathway | 0.95451254 | 0.56 | 1.5437053 | 0.05009634 |
| | MAPK pathway | 1.1132134 | 0.25617284 | 1.637257 | 0.003289474 |
| | HIF pathway | 1.6175048 | 0.03117506 | 1.2415975 | 0.24675325 |
| | Purinergic receptor signaling pathway | 1.5389258 | 0.06153846 | 1.6881694 | 0.015968064 |
| | Purinergic receptor P2Y signaling pathway | 1.5777047 | 0.036876354 | 1.4960046 | 0.0503876 |
| Cellular Functions | Translation | 1.4605397 | 0.028673835 | -1.8208151 | 0 |
| | Ribosome | 1.1336436 | 0.23906706 | -1.8144947 | 0 |
| | 43S Ribosome | 1.1219336 | 0.28531855 | -1.8783303 | 0 |
| | EIF pathway | 1.2240345 | 0.23933649 | -1.5321815 | 0.05 |
| | Protein export | 1.5375326 | 0.06122449 | -0.91629106 | 0.5726496 |
| | Amino acid degradation | -1.5101556 | 0.018518519 | 1.3439316 | 0.05457464 |
| | Proteosome | -1.3857354 | 0.06864274 | 1.4372953 | 0.06989247 |
| | Ubiquitin mediated proteolysis | -0.926711 | 0.593361 | 1.7934686 | 0 |
| | Protein degradation-Parkin pathway | -0.7039581 | 0.83516484 | 1.8072739 | 0.003731343 |
| | Cell death | 1.5957102 | 0.035128806 | 2.0140357 | 0 |
| | Caspase cascades | 1.5200465 | 0.065853655 | 1.9330438 | 0.001801802 |
| | Cell death mitochondria pathway | 1.6465071 | 0.029268293 | 1.6357895 | 0.016393442 |


NES, normalized enrichment score; p-val, nominal p value. 

Somaclonal variation arises in plants and animals when differentiated
somatic cells are induced into a pluripotent state, but the resulting
clones differ from each other and from their parents. In agriculture,
somaclonal variation has hindered the micropropagation of elite hybrids
and genetically modified crops, but the mechanism responsible remains
unknown. The oil palm fruit ‘mantled’ abnormality is a somaclonal variant
arising from tissue culture that drastically reduces yield, and has
largely halted efforts to clone elite hybrids for oil production. Widely
regarded as an epigenetic phenomenon, ‘mantling’ has defied explanation,
but here we identify the MANTLED locus using epigenome-wide association
studies of the African oil palm Elaeis guineensis. DNA hypomethylation of
a LINE retrotransposon related to rice Karma, in the intron of the
homeotic gene DEFICIENS, is common to all mantled clones and is
associated with alternative splicing and premature termination. Dense
methylation near the Karma splice site (termed the Good Karma epiallele)
predicts normal fruit set, whereas hypomethylation (the Bad Karma
epiallele) predicts homeotic transformation, parthenocarpy and marked
loss of yield. Loss of Karma methylation and of small RNA in tissue
culture contributes to the origin of mantled, while restoration in
spontaneous revertants accounts for non-Mendelian inheritance. The
ability to predict and cull mantling at the plantlet stage will
facilitate the introduction of higher performing clones and optimize
environmentally sensitive land resources.

Figure 1 | Epigenome-wide association study (EWAS). a-c, Normal (a),
fertile mantled (b) and parthenocarpic mantled (c) fruit shown as whole
fruit (top), longitudinal sectioned (middle) and cross sectioned (bottom).
Black arrows denote pseudocarpels; white arrows denote kernel. d, Circos
plot of oil palm chromosomes. Track order: gene density (i); repeat
density (ii); cytosine methylation density (whole-genome bisulfite
sequencing) in ortet (iii); cytosine methylation densities (microarray) of
ortet (iv), normal ramet (v) and mantled ramet (vi); differential cytosine
methylation of normal minus mantled ramets (vii). Heatmaps represent
average cytosine methylation densities in ~300-kb windows independent of
sequence context. e, Venn diagram of microarray features differentially
methylated between leaves from mantled and normal ramets (P < 0.05,
two-sided Student’s t-test, Methods). Each set represents clonal lineages
of given genotypes obtained from one source: source A (red, 15 mantled,
15 normal), source B (brown, 6 mantled, 14 normal), source C (blue,
14 mantled, 15 normal) and source D (green, 8 mantled, 10 normal). Red
numbers indicate subsets including one of the four microarray features
mapping to the Karma LINE element.

The African oil palm (E. guineensis) is the most efficient oil-bearing
crop, but demand for edible oils and biofuels, combined with
sustainability concerns over dwindling rainforest reserves, has led to
intense pressure to improve oil palm yield. Introduction of the tenera
hybrid (dura × pisifera) increased oil yield by up to 30%, leveraging the
SHELL gene that confers single gene heterosis. Clones (ramets) of
individual high-yielding tenera hybrid palms (ortets) provide a powerful
shortcut to yield enhancement, with an additional 20-30% improvement.
Micropropagation through cell culture of immature apex leaf tissue (the
‘heart of palm’), and plantlet regeneration on hormone-supplemented media
(Methods), yields tens of thousands of genetically identical clonal palms.
Unfortunately, shortly after the procedure was established, Tan Yap Pau of
United Plantations, Malaysia, first noted a high frequency of homeotic
floral phenotypes known as “mantling” among clonal ramets (T.Y.P.,
personal communication). Subsequently, Corley et al. documented the
occurrence of mantled palms after prolonged periods in culture. In mantled
palms, staminodes of pistillate flowers and stamens of staminate flowers
develop as pseudocarpels, often resulting in sterile parthenocarpic
flowers with abortive fruit and very low oil yields (Fig. 1a-c and
Extended Data Fig. 1). Pollination of mantled palms gave rise to variable
numbers of mantled progeny, resembling rare naturally mantled variants
known as poissoni, or diwakkawakka fruit forms. The trait is non-Mendelian
and sometimes reverts to normal and so has long been considered
epigenetic, with an overall decrease in DNA methylation found in mantled
ramets. The homeotic transformations observed in mantled palms resemble
defects in B-function MADS-box genes, suggesting strong candidates for
epigenetic modification. However, decades of research into candidate
retroelements and candidate homeotic genes have failed to identify
epigenetic changes consistently found in somaclonal mantled palms.


We performed a genome-wide, unbiased, DNA methylation analysis (an
epigenome-wide association study; EWAS) in search of loci epigenetically
associated with the mantled phenotype, using a DNA microarray based on the
E. guineensis (pisifera) reference genome (Methods). DNA methylation
density was measured in 1-2-kilobase (kb) intervals surrounding each
feature by DNA-methylation-dependent comparative microarray hybridization
and statistical analyses (Methods). Genome-wide DNA methylation maps were
constructed from parthenocarpic mantled (n = 43) or normal (n = 54)
ramets, as well as ortets from which these ramets were derived (n = 10).
These maps strongly resembled those constructed by whole-genome bisulfite
sequencing in sample palms (Fig. 1d), demonstrating reproducibility.

Figure 2 | Hypomethylation of Karma is associated with the mantled
phenotype. a, Microarray feature data plotted below a map of the EgDEF1
gene (vertical ticks, exons; horizontal line, introns; arrow, direction of
transcription) including locations of Rider, Karma (dashed box) and Koala
retrotransposons. CG and CHG sites are shown at the top. log10 P values
(54 normal versus 43 parthenocarpic mantled ramets) are plotted (two-sided
Student’s t-test). Arrow in P value plot denotes feature detected as
hypomethylated in mantled ramets from all four sources (Fig. 1e).
b, Genome-wide bisulfite sequencing of leaf samples from ortet (O, black,
n = 5), normal ramets (N, green, n = 5) and parthenocarpic mantled ramets
(M, red, n = 5). Mean methylation density per cytosine is plotted on a
0-100% scale for each cytosine context and sample type. CHG DMR,
differentially CHG methylated region corresponding to Karma. c, CHG
methylation monitored in 86 additional ortets, mantled and normal ramet
leaf samples by restriction enzyme digestion and qPCR (Methods). Linear
discriminant analysis was performed between normal (n = 21) and mantled
(n = 28) samples with BbvI and RsaI restriction sites. FN1 and FN2, two
false-negative mantled samples. Green and red arrows denote normal and
mantled control samples, respectively. A similar analysis was performed on
remaining normal (n = 14) and mantled (n = 23) samples with ScrFI
restriction sites (Extended Data Fig. 4d). d-g, Karma bisulfite sequencing
maps (antisense strand) of normal control (d), mantled control (e),
FN1 (f) and FN2 (g). Thirteen CHG sites are shown to scale above. ‘S’
denotes CHG at the Karma splice acceptor site (CAG/CTG); ‘B’ denotes the
BbvI site. Bar, CHGs within the common microarray feature (Fig. 1e).
Methylated and unmethylated CHG sites are indicated by green and red
boxes, respectively. Open boxes denote low-quality base calls. Each row
represents an individual Sanger DNA sequencing read.

Figure 3 | Karma methylation in revertant palms. a-c, Spikelet from a
revertant ramet (a) including normal (b) and fertile (c) mantled fruit
with one or two pseudocarpels (arrows). d, Density of CHG methylation
(percentage mCHG) at the BbvI site (Methods) in ramets yielding 100%
normal fruit (n.f.) (green), revertant ramets yielding 99% (yellow) or 95%
(orange) normal fruit and a mosaic ramet yielding 7% (red) normal fruit
per bunch. Error bars denote s.d. (biological replicates of fronds
(n = 4), rachis sections (n = 8) or fruit (n = 2)). e, f, Percentage mCHG
for the three CHG sites found in the unique common microarray feature in
normal (green) and subtly mantled (red) fruit from revertant ramets
yielding 99% (e) or 95% (f) normal fruit per bunch (two-tailed Fisher’s
exact test; NS, not significant). Alleles were analysed separately based
on a heterozygous single nucleotide polymorphism (SNP) within the
bisulfite sequencing amplicon.

At genome-wide resolution, the landscape of DNA methylation was
remarkably consistent between ortets and ramets (Fig. 1d), with highest
methylation within repetitive sequences. However, thousands of loci were
differentially methylated (Fig. 1d), most of which (~90%) were
hypomethylated in mantled, consistent with previously reported reduced
5mC content. Most hypomethylated loci (~75%) were transposons and repeats,
while less frequent hypermethylated loci included genic sequences
(Extended Data Fig. 2), resembling cell cultures of Arabidopsis. Fifteen
independent somaclonal lineages obtained from four independent sources
were used to maximize genotypic diversity, and significant differentially
methylated regions (DMRs) between normal and fully mantled samples were
first identified within each source population (Methods). Results were
then compared between populations (Fig. 1e). Although tens of thousands of
DMRs were detected between mantled and normal clones in each population,
99.9% of these were exclusive to either one (94.4%) or two (5.5%) of the
four populations, indicating considerable genotypic variation in
epigenetic response to tissue culture. A single microarray feature
detected differential methylation between normal and mantled clones in all
four populations (Fig. 1e). This feature lies within the ~35 kb intron 5
of EgDEF1 (Fig. 2a), the oil palm orthologue of the B-class MADS-box
transcription factor genes, Antirrhinum majus DEFICIENS (DEF) and
Arabidopsis APETALA3 (AP3).
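The cross-population comparison described above (DMRs called within each source, then intersected across the four sources) can be sketched as a set operation. This is a minimal illustration rather than the authors' pipeline: it assumes the per-source significance calls have already been made, and the feature IDs (`f1`, `f2`, ...) and function names are hypothetical.

```python
from collections import Counter

def source_counts(dmrs_by_source):
    """Map each feature ID to the number of sources in which it was called a DMR."""
    counts = Counter()
    for features in dmrs_by_source.values():
        counts.update(set(features))  # count each feature once per source
    return counts

def features_in_all_sources(dmrs_by_source):
    """Return the sorted feature IDs called in every source population."""
    counts = source_counts(dmrs_by_source)
    n_sources = len(dmrs_by_source)
    return sorted(f for f, c in counts.items() if c == n_sources)

# Hypothetical feature IDs; real calls would come from the per-source t-tests.
dmrs = {
    "A": {"f1", "f2", "f7"},
    "B": {"f2", "f3"},
    "C": {"f2", "f4", "f7"},
    "D": {"f2", "f5"},
}
common = features_in_all_sources(dmrs)
# Histogram of how many sources share each feature (the Venn subset sizes).
sharing = Counter(source_counts(dmrs).values())
print(common, dict(sharing))
```

In this toy input most features are private to one source and a single feature is shared by all four, mirroring the pattern reported in Fig. 1e.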

Figure 4 | Alternative splicing and loss of 24-nucleotide siRNA. a, EgDEF1
transcripts assembled from transcriptome sequencing (data not shown) and
RT-PCR (Methods). Black boxes denote exons; blue box denotes intron 5
sequence included in the tDEF1 transcript. Coordinates relative to the
reference pisifera oil palm genome. b, Quantitative RT-PCR of cDEF1, tDEF1
and kDEF1 transcripts in shoot apices (stage 0) and in early (stage 2) to
late (stage 5) female inflorescences from normal and parthenocarpic
mantled ramets. Error bars, s.d. between three replicate assays of three
replicate tissue samples per phenotype, per stage. Expression relative to
an endogenous reference gene is shown (Methods). c, 24-nucleotide siRNA
accumulation in shoot apices (stage 0) from normal (n = 5) and
parthenocarpic mantled (n = 7) ramets, and from second passage apical leaf
tissue cultures re-cloned from normal (n = 2) or mantled (n = 1) ramets
(Methods). Values expressed as fragments per kilobase per million mapped
reads (FPKM). Bars above (sense) and below (antisense) the line indicate
mapped normalized 24-nucleotide siRNAs that are not significantly
different in abundance in normal and mantled (grey) or significantly
differentially expressed in normal (green) relative to mantled (red)
(P < 0.05, Student’s t-test, two tailed, assuming equal variance).

Elaeis guineensis DEF1 spans ~40 kb on chromosome 12 (Fig. 2a). A
Ty1/copia Rider retrotransposon lies upstream, while a Ty3/gypsy
retrotransposon, Koala, is located within intron 5. Consistent with
important earlier work, no DNA methylation difference within these
retrotransposons was found in mantled clones across several populations
(Fig. 2a and Extended Data Fig. 3). However, a third previously unreported
repetitive element lies within intron 5, and has homology to rice Karma
LINE elements. Karma is activated in rice embryogenic tissue culture, but
only transposes in regenerated plants as transgenerational DNA
hypomethylation of the element persists. The 3.2-kb oil palm Karma element
is flanked by a 13-base-pair (bp) target site duplication (TTCAAAATGATGA)
and includes a defective reverse transcriptase open reading frame (ORF2)
preceded by a splice acceptor and followed by a polyadenylation signal,
resembling truncated Karma elements in rice (Supplementary Fig. 1). The
unique microarray feature, which consistently detected hypomethylation in
mantled clones, serendipitously includes the predicted splice acceptor
site (GAACAGAATGC). All three additional microarray features mapping
within the Karma element also detected significant hypomethylation in
mantled clones (Fig. 2a and Extended Data Figs 3 and 4a-c).
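A target site duplication of the kind described above appears as the same short sequence directly repeated on both sides of the inserted element. The sketch below shows one simple way to look for such flanking repeats. The 13-bp sequence is taken from the text, but the function name, the toy sequence and the gap bounds are illustrative assumptions; for the 3.2-kb element the gap bounds would be set around 3,200 bp.

```python
def find_tsd_pairs(seq, tsd, min_gap, max_gap):
    """Return (left, right) start positions of direct repeats of `tsd`
    separated by between min_gap and max_gap intervening bases."""
    starts = []
    pos = seq.find(tsd)
    while pos != -1:
        starts.append(pos)
        pos = seq.find(tsd, pos + 1)
    pairs = []
    for i, left in enumerate(starts):
        for right in starts[i + 1:]:
            gap = right - (left + len(tsd))  # bases between the two copies
            if min_gap <= gap <= max_gap:
                pairs.append((left, right))
    return pairs

TSD = "TTCAAAATGATGA"  # 13-bp duplication quoted in the text
# Toy locus: 40 bases of filler stand in for the element body (hypothetical).
toy = "GGCC" + TSD + "ACGT" * 10 + TSD + "CCGG"
pairs = find_tsd_pairs(toy, TSD, min_gap=30, max_gap=50)
print(pairs)
```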

To verify Karma hypomethylation, sample trios comprising genetically
identical ortet, parthenocarpic mantled and normal ramets from five
independent clonal lineages were subjected to whole-genome bisulfite
sequencing (Methods). CG methylation was unchanged across the EgDEF1
locus, but Karma CHG methylation (H = A, C or T) was markedly reduced in
mantled clones, revealing a DMR covering ~70 CHG sites. CHH methylation
was much lower and only subtly reduced (Fig. 2b). To validate differential
CHG methylation in unrelated clonal palms, quantitative PCR (qPCR) assays
were used to quantify CHG methylation at BbvI and RsaI restriction sites
within the DMR (Methods, Fig. 2c and Extended Data Fig. 4b) in a panel of
49 palms from 21 clonal lineages and 4 independent sources: 8 ortets and
13 normal clones, 19 parthenocarpic mantled clones, 2 fertile mantled
clones and 7 partially revertant clones yielding bunches with both mantled
and normal fruit. Linear discriminant analysis provided 93% sensitivity
and 100% specificity for detection of mantling (Fig. 2c). Fronds from all
seven of the revertant palms were scored as mantled, consistent with the
observation that normal bunches arose late in development. Similar results
were obtained in 37 polymorphic palms using alternative restriction sites
(Extended Data Fig. 4d). The two false-negative mantled palms (Fig. 2c)
were further analysed by bisulfite sequencing of a region spanning the
Karma splice acceptor site. While normal clones had dense CHG methylation,
and mantled controls had lost all CHG methylation (Fig. 2d, e), the
false-negative mantled samples lost CHG methylation near the splice
acceptor site (Fig. 2f, g and Extended Data Fig. 4e), which was therefore
sufficient to predict the mantled phenotype. Because of their strong
predictive properties, we named the mantled hyper- and hypomethylated
epialleles Good Karma and Bad Karma, respectively.
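The reported sensitivity and specificity can be checked against the panel composition. The counts below are read from the text and the Fig. 2c legend (28 mantled samples including two false negatives, 21 normal samples); the absence of false positives is inferred from the stated 100% specificity, and the function name is illustrative.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of mantled palms detected
    specificity = tn / (tn + fp)  # fraction of normal palms scored normal
    return sensitivity, specificity

# 28 mantled samples with two false negatives (FN1, FN2); 21 normal samples.
# fp=0 is inferred from the reported 100% specificity.
sens, spec = sensitivity_specificity(tp=26, fn=2, tn=21, fp=0)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

With these counts, 26 of 28 mantled samples are detected, which rounds to the 93% quoted in the text.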

Two lineages of revertant palms had mixed bunches with both normal and
mantled fruit, resembling epialleles in maize regulated by transposons.
The first lineage included two revertant ramets with 99% and 95% normal
fruit per bunch, respectively, in which abnormal fruit had only one or two
small pseudocarpels (Fig. 3a-c). A second lineage included a mosaic ramet
with only 7% normal fruit. In all three ramets, CHG methylation at the
BbvI site was low in fronds (Fig. 3d), consistent with other revertants
(Fig. 2c). However, methylation was restored in fruit from the two
revertant ramets, but not from the mantled mosaic ramet (Fig. 3d-f). As
with similar epialleles in maize, Linaria and other plants, reversion of
the abnormal phenotype accompanied by restoration of DNA methylation is
strong evidence that Karma hypomethylation is the cause of the mantled
phenotype. Differential methylation between individual mantled and normal
fruit was not observed, however, probably reflecting non-cell autonomy of
the B-class homeotic phenotype (Fig. 3a-d), also observed in Antirrhinum
and Arabidopsis. Bisulfite sequencing from normal and mantled fruit
(Extended Data Fig. 5) revealed hyper- and hypomethylated reads at the
splice acceptor site (Fig. 3e, f), suggesting that these fruit were indeed
mosaic for hyper- and hypomethylated cells. In one mosaic palm, direct
evidence for mosaicism was obtained from different samples of the same
vegetative frond (Extended Data Fig. 6).
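The comparison of methylated and unmethylated bisulfite reads between normal and mantled fruit (Fig. 3e, f) uses a two-tailed Fisher's exact test. A minimal pure-Python sketch of that test follows. The read counts are hypothetical, chosen only to illustrate a non-significant result, and the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum tables at least as extreme (no more probable) than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows are normal and mantled fruit,
# columns are methylated and unmethylated reads at one CHG site.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```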

DNA methylation near splice acceptor sites affects alternative splicing,
although the mechanism remains unclear. To assess alternative splicing,
EgDEF1 transcript models were built on the basis of transcriptome
sequencing (data not shown), and were validated by reverse transcription
PCR (RT-PCR; Methods). As previously reported, two forms of EgDEF1
transcripts were found in both normal and mantled inflorescences: the
full-length EgDEF1 transcript (cDEF1), and a prematurely terminated
transcript including exons 1-5 and 221 bp of intron 5 (tDEF1) (Fig. 4a and
Extended Data Fig. 4c). However, we identified a third alternative
transcript in mantled female inflorescences (Fig. 4a). This novel
transcript (kDEF1) was spliced from the donor site of intron 5 to the
proximal Karma acceptor site, and is predicted to encode a truncated
EgDEF1 peptide that terminates within the K domain of the MADS-box
protein, and has a unique carboxy-terminal sequence (Extended Data
Fig. 7a).

Expression of the three transcripts (Fig. 4b) was assessed in shoot
apical meristem (stage 0) and four stages of immature female inflorescence
development (Methods and Extended Data Fig. 7b-f). kDEF1 expression was
notably restricted to stages 3-5 of mantled (but not normal) female
inflorescence, strongly suggestive of a role in the mantled phenotype. By
contrast, cDEF1 was detected at only slightly (although significantly)
lower levels in mantled female inflorescences, while tDEF1 was unchanged
(Fig. 4b). Previously reported differences in the timing of tDEF1 and
cDEF1 transcription within each stage may be the consequences of kDEF1
transcription (Methods).

In plants, 24-nucleotide small interfering RNAs (siRNAs) guide non-CG
methylation, and we identified a cluster of antisense 24-nucleotide Karma
siRNAs in shoot apical meristem (stage 0), which were reduced or absent in
mantled (Fig. 4c) and in later stages of normal inflorescence (Extended
Data Fig. 8 and Methods). In polyembryogenic tissue cultures derived from
normal and abnormal clonal palms (Methods), small RNA (sRNA) underwent a
switch from 24 to 21 nucleotides (Extended Data Fig. 9), resembling cell
cultures and callus of Arabidopsis. Karma methylation (Extended Data
Fig. 10) and 24-nucleotide siRNA (Fig. 4c and Extended Data Fig. 8) were
reduced in normal cultures between two and seven passages, and lost in
abnormal cultures, suggesting a model for the origin of mantled: if
meristems are the source of 24-nucleotide siRNA (Fig. 4c), then leaf cells
detached from the meristem would progressively lose 24-nucleotide siRNA
and non-CG DNA methylation over time in culture. Antisense sRNA might
influence exon trapping, as it does in humans, or else splicing might be
associated with changes in chromatin, for example histone H3 Lys4
methylation. Good Karma would only be restored during shoot generation if
DNA methylation and small RNA were not entirely lost during tissue
culture, or potentially if siRNAs were artificially applied during the
tissue culture process. The 24-nucleotide siRNA could also facilitate
non-Mendelian segregation, which resembles paramutation (the
interconversion of heterozygous epialleles) in some respects.
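The siRNA abundances in Fig. 4c are expressed as fragments per kilobase per million mapped reads (FPKM). As a reminder of what that normalization does, the sketch below applies the standard FPKM definition; the counts, region length and library size are hypothetical and are not taken from the study.

```python
def fpkm(fragments, region_length_bp, total_mapped_fragments):
    """Fragments per kilobase of region per million mapped fragments."""
    kilobases = region_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)

# Hypothetical example: 50 siRNA fragments over a 500-bp cluster
# in a library of 10 million mapped fragments.
value = fpkm(fragments=50, region_length_bp=500, total_mapped_fragments=10_000_000)
print(value)
```

Dividing by both region length and library size is what allows clusters of different sizes, sequenced to different depths, to be compared on one scale.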

Despite its importance, mantling has been the elusive target of molecular
genetic investigation for the past three decades. We have demonstrated
that the mantled trait is a consequence of epigenetic modification of the
Karma transposable element within the B-class MADS-box EgDEF1 gene, which
we have named MANTLED. B-class function in stamen identity is conserved in
monocots, but similar to other monocot paleoAP3 genes, EgDEF1
overexpression fails to cause homeotic conversion in Arabidopsis, because
of a diverged C-terminal exon. Nonetheless, like AP3 and DEF, the oil palm
MANTLED gene is expressed in inner perianth and stamen primordia as the
perianth initiates (stage 2), followed by stamen and stamenoid primordia
(stage 3). The appearance of kDEF1 transcripts during this transition
(Fig. 4b) suggests that kDEF1 functions at this crucial window to induce
the mantled phenotype.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS 


Samples and tissue culture. Leaf tissue used in the EWAS discovery and
validation panels was sampled from oil palm clones (known as ramets)
derived from tenera mother palms (known as ortets) from various genetic
backgrounds. The female parents of the ortets were predominantly of a Deli
dura background while the pisifera male parents were derived from La Me,
Yangambi, AVROS and Binga genotypes. These parental lines make up the
major genetic backgrounds of the oil palm populations in Malaysia. These
palms were collected from the Malaysian Palm Oil Board (MPOB), United
Plantations Berhad, FELDA Global Ventures R&D Sdn Bhd and Applied
Agricultural Resources Sdn Bhd (Extended Data Fig. 3). The ramets were
derived from explants excised from non-chlorophyllous leaves of their
respective ortets cultured on hormone-supplemented media. The oil palm
tissue culture process involves callus initiation and polyembryoid
generation followed by shoot and root initiation. In large scale
production, the tissue culture process takes 48-52 months to complete.
Once established, ramets undergo acclimatization in the nursery for
3-4 months, are moved to the field nursery for another 8-9 months, and
finally are field planted. At every stage of the tissue culture process,
off-types are culled. Flower and fruit bunch censuses are taken at the
onset of flowering (2-3 years after field planting) and in subsequent
years. Normal and mantled ramets were identified based on the census data
collected.

Recloned tissue culture materials were generated using the standard protocol 
described above. Normal and abnormal ramets from identical genetic back- 
grounds were subjected to the tissue culture process and sub-sampling was carried 
out at the polyembryogenic stage, namely at subculture passage two (SC2) and 
seven (SC7), representing short and prolonged exposure in culture on hormone- 
supplemented media, respectively. 

Female inflorescence samples used in staging the developmental phase were 
obtained from a total of 31 clonal palms with ages ranging from 3 to 10 years after 
field-planting. As a means to categorize the developmental phases, these inflor- 
escence samples were histologically analysed and classified as stage 0, shoot apical 
meristem; stage 2, initiation of perianth organs; stage 3, development of perianth 
organs and initiation of reproductive organs; stage 4, development of reproductive 
organs; stage 5, fully formed reproductive organs according to ref. 36. 

E. guineensis genome microarray design. The microarray design was based on
the E. guineensis genome and contains more than 1 million 60-base probes.
Probes were selected from non-overlapping 1.5-kb windows across all
scaffolds of the P1 pisifera genome build, choosing the 60-mer with the
lowest composite 15-mer frequency count within the genome. When compared
to the publicly available EG5 genome assembly, 860,861 probes from the
microarray matched at 100% stringency (81,194 probes falling on exons and
the remaining covering intronic and intergenic regions of the genome). The
microarrays were manufactured by Roche NimbleGen using the HX1 platform.
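The probe selection rule described above (one 60-mer per non-overlapping 1.5-kb window, chosen for the lowest composite 15-mer frequency) can be sketched as follows. This is an illustration under stated assumptions, not the manufacturer's or authors' code: "composite" is taken here to mean the sum of genome-wide counts of the probe's constituent k-mers, ties go to the leftmost probe, and the function names and toy parameters are hypothetical. The real design would use window_len=1500, probe_len=60 and k=15.

```python
from collections import Counter

def kmer_counts(genome, k):
    """Count every k-mer occurring in the genome string."""
    return Counter(genome[i:i + k] for i in range(len(genome) - k + 1))

def pick_probe(window, counts, probe_len, k):
    """Return (score, start, probe) for the lowest-scoring probe in a window.
    Score is the summed genome-wide count of the probe's k-mers (assumed
    meaning of 'composite frequency'); ties resolve to the leftmost probe."""
    best = None
    for start in range(len(window) - probe_len + 1):
        probe = window[start:start + probe_len]
        score = sum(counts[probe[i:i + k]] for i in range(probe_len - k + 1))
        if best is None or score < best[0]:
            best = (score, start, probe)
    return best

def design_probes(genome, window_len, probe_len, k):
    """Pick one probe per non-overlapping window; returns (position, probe, score)."""
    counts = kmer_counts(genome, k)
    probes = []
    for w in range(0, len(genome) - window_len + 1, window_len):
        score, start, probe = pick_probe(genome[w:w + window_len], counts, probe_len, k)
        probes.append((w + start, probe, score))
    return probes

# Toy genome with small parameters so the result can be followed by hand.
toy_genome = "AAAAACGT" + "AAAAAAAA"
probes = design_probes(toy_genome, window_len=8, probe_len=4, k=2)
print(probes)
```

In the first toy window the repeat-free stretch wins, which is the behaviour the rule is designed to produce: probes are steered away from sequence that recurs across the genome.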

DNA methylation-dependent fractionation. Genomic DNA (60 [1g) was mech- 
anically hydrosheared to 1-4-kb fragments. Sheared DNA was divided into four 
equal portions, two of which were digested with 10 U ig McrBC (New England 
Biolabs) under manufacturer’s recommended conditions and the other two were 
mock-treated without adding enzyme. After digestion, DNAs were treated with 
proteinase K (50mgml~') for 1h at 50°C and then precipitated with ethanol 
under standard conditions. Resuspended DNAs were resolved by agarose gel 
electrophoresis, and DNA in the 1-4-kb size range was excised from gels and 
extracted. McrBC requires that two methylated half sites (RmC, where R = A or
G) lie within 40-3,000 bp of each other, and cutting occurs in the proximity of one
of the half sites. Because 1-4-kb fragments are treated with McrBC, and undigested
fragments are isolated, hybridization of a microarray probe complementary
to sequence distant from the methylation site results in a ‘wingspan’ effect in which
probes are able to detect DNA methylation from a distance up to ~1.5 kb. For
each fraction, 200 ng was used for cyanine dye labelling (Cy3 or Cy5). For each
sample, four microarrays were hybridized in a duplicated dye swap design to 
differentially labelled untreated and DNA methylation depleted fractions. 
Sample size was chosen to allow several clonal lineages including both normal 
and parthenocarpic samples from each of four independent sources, but no stat- 
istical methods were used to determine sample size. 
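
The McrBC recognition rule described above (two methylated RmC half sites lying 40-3,000 bp apart) can be illustrated with a minimal sketch. It assumes a single-strand sequence string and a set of methylated cytosine indices; the function names and this input representation are hypothetical and are not part of the published pipeline.

```python
def is_half_site(seq, i, methylated):
    """True if index i is a methylated C directly preceded by a purine (A or G)."""
    return i > 0 and seq[i] == "C" and i in methylated and seq[i - 1] in "AG"


def mcrbc_can_cut(seq, methylated, min_gap=40, max_gap=3000):
    """True if any two RmC half sites on this strand lie min_gap..max_gap bp apart.

    Assumption: only the given strand is inspected; a real analysis would
    also have to consider the complementary strand.
    """
    sites = [i for i in range(len(seq)) if is_half_site(seq, i, methylated)]
    return any(
        min_gap <= b - a <= max_gap
        for k, a in enumerate(sites)
        for b in sites[k + 1:]
    )


# Two methylated half sites (A-mC and G-mC) separated by 52 bp.
example = "AC" + "T" * 50 + "GC" + "T" * 10
print(mcrbc_can_cut(example, {1, 53}))
```

A fragment that satisfies this rule would be digested and therefore depleted from the gel-purified 1-4-kb fraction, which is what makes the untreated versus depleted comparison informative.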

Microarray data processing, normalization and statistical analysis. Among the 
~860,000 microarray probes matching the oil palm genome with 100% sequence 
stringency, a subset of ~460,000 uniquely mapped probes was selected to reduce 
noise from non-specific hybridizations. For data processing, corrections on spatial 
non-uniformity of fluorescent signal intensities, done separately for the Cy3 and 
Cy5 dye channels, were made. Data were then normalized by background sub- 
traction using negative control probes, followed by scaling. After normalization, 
median log signal of the control probes for each dye of each array was set to zero. 
MAD (median absolute deviation) log signal of all the probes on an array was used 
as a constant for a given treatment (untreated or DNA methylation depleted). 
DNA methylation was measured as the sample average (over the two pairs of
dye-swapped technical replicates) of normalized log2 ratios of untreated over
methylation-depleted DNA. Statistical analysis was first conducted within samples 
derived from each source independently. Within each group, a two-sided t-test 
was performed between normal and mantled phenotypes. Using a cut-off of 
P = 0.05, one probe that was significantly differentially methylated in all the four
groups was identified. To confirm this finding with a different statistical approach, 
quantile normalization of methylation measurements (the sample average of nor- 
malized log ratios of untreated over methylation depleted DNA) was performed on 
all samples together, and then a t-test on all normal versus all mantled samples was 
conducted. This process identified the same probe that was found in the initial 
analysis, as well as an additional three immediately neighbouring probes. The 
experiments were not randomized, and investigators were not blinded to alloca- 
tion during experiments and outcome assessment. 
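
As a rough illustration of the measurement described above, the following sketch centres log2 signals on the control-probe median, averages the log2 ratio of untreated over methylation-depleted signal across the dye-swapped hybridizations, and intersects per-source lists of significant probes. The function names and data layout are assumptions made for illustration; spatial correction, background subtraction and MAD scaling are omitted, and the authors' own code is the downloadable archive named under Code availability.

```python
import statistics


def centre_on_controls(log_signals, control_idx):
    """Shift one channel so the median log2 signal of its control probes is zero."""
    ctrl_median = statistics.median(log_signals[i] for i in control_idx)
    return [x - ctrl_median for x in log_signals]


def methylation_score(replicates):
    """Mean log2(untreated / depleted) for one probe.

    replicates: list of (untreated_log2, depleted_log2) tuples, one per
    hybridization in the duplicated dye-swap design.
    """
    return statistics.mean(u - d for u, d in replicates)


def probes_significant_in_all(per_source_hits):
    """Probe IDs reported as significant in every source group."""
    return set.intersection(*per_source_hits)


print(centre_on_controls([1.0, 2.0, 3.0, 5.0], control_idx=[0, 2]))
print(methylation_score([(3.0, 1.0), (2.5, 1.5), (3.5, 1.5), (3.0, 2.0)]))
```

Requiring significance in every source independently is a conservative filter, which is consistent with only a single probe surviving the first analysis.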

Code availability. Computer code for microarray data processing and normalization
is available for download at http://www.oriongenomics.com/files/methylscope_processing.tar.

qPCR DNA methylation assays. Primer pairs were designed to amplify two 
Karma element regions as diagrammed in Extended Data Fig. 4a, b. A 633-bp 
amplicon included methylation-sensitive restriction sites containing CHG posi- 
tions 188 bp (BbvI) and 375 bp (ScrFI) downstream of the Karma splice site CHG. 
A 632-bp amplicon that amplified a region near the centre of the Karma element 
included a RsaI methylation-sensitive restriction site. Primer pairs were confirmed
to amplify a single band of the correct size by agarose gel electrophoresis. For qPCR
DNA methylation assays, 100 ng of genomic DNA was digested with 10 U µg⁻¹ of
the indicated restriction enzyme under standard conditions. An equal amount of 
genomic DNA was mock-treated in a reaction lacking enzyme. Digestion reactions 
were incubated at 37 °C for 16 h. qPCR was carried out using 10 ng each of the
mock-treated and enzyme-digested samples in 1× Roche SYBR Green Master
Mix on a Roche LC480 instrument. qPCR amplifications were performed in
duplicate. For each duplicate mock/digested amplification pair, the $\Delta C_t$ value
was calculated as the digested $C_t$ minus the mock $C_t$, and duplicated $\Delta C_t$ values
were averaged. DNA methylation density was calculated as percentage dense
methylation = $2^{-(C_{t,\mathrm{digested}} - C_{t,\mathrm{mock}})}$. Samples were genotyped by restriction
digestion of PCR amplicons with either BbvI or RsaI to confirm that all samples
used for DNA methylation validation included intact restriction sites on both 
alleles. Enzyme digestions were quality controlled by performing qPCR assays 
monitoring three independent invariantly unmethylated endogenous genomic 
loci and one invariantly methylated endogenous genomic locus. All digestions
that passed quality control reported <5% methylation of the unmethylated controls
and >95% methylation of the methylated control. Primer sequences are
available on request. 
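
The calculation described above can be sketched as follows. Because the restriction enzymes are methylation-sensitive, template that survives digestion is taken to be methylated, so the surviving fraction is 2 raised to the negative averaged ΔCt. Multiplying by 100 to express the result as a percentage, the function names and the example values are assumptions for illustration only.

```python
def percent_dense_methylation(ct_pairs):
    """Percentage methylation from duplicate (mock_ct, digested_ct) pairs.

    delta Ct = digested Ct - mock Ct, averaged over duplicates; the surviving
    (methylated) fraction is 2 ** (-delta Ct). The factor of 100 is an
    assumption used here to express the fraction as a percentage.
    """
    delta_ct = sum(digested - mock for mock, digested in ct_pairs) / len(ct_pairs)
    return 100.0 * 2.0 ** (-delta_ct)


def passes_qc(unmethylated_controls, methylated_control):
    """Apply the stated thresholds: <5% for unmethylated loci, >95% for the methylated locus."""
    return all(u < 5.0 for u in unmethylated_controls) and methylated_control > 95.0


# One extra cycle after digestion means half of the template survived.
print(percent_dense_methylation([(25.0, 26.0), (25.0, 26.0)]))
```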

Whole-genome bisulfite sequencing. Genomic DNA (1 µg) from each of 15
mature leaf samples (5 lineage trios of ortet, normal ramet and mantled ramet) 
was used to construct TruSeq fragment libraries (Illumina), and up to 500 ng of 
adapted library molecules were bisulfite converted using the EZ DNA 
Methylation-Lightning Kit (Zymo Research). Each library was sequenced in one 
lane of a HiSeq 2000 flow cell to generate paired 100-bp reads. Reads were mapped 
to an in silico bisulfite converted reference E. guineensis (pisifera) genome. For 
each cytosine context (CG, CHG or CHH), the number of mapped reads corres- 
ponding to unconverted cytosines relative to the total number of reads including 
the particular base was used to calculate the percent methylation at each cytosine 
position. 
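
A minimal sketch of the per-cytosine calculation described above is given below. It classifies a forward-strand cytosine as CG, CHG or CHH (H = A, C or T) and computes percent methylation as unconverted reads over total reads covering the base. The function names are hypothetical, and the sketch assumes an unambiguous A/C/G/T reference.

```python
def cytosine_context(seq, i):
    """Return 'CG', 'CHG' or 'CHH' for the C at index i, or None if undeterminable."""
    if seq[i] != "C" or i + 1 >= len(seq):
        return None
    if seq[i + 1] == "G":
        return "CG"
    if i + 2 >= len(seq):
        return None
    return "CHG" if seq[i + 2] == "G" else "CHH"


def site_percent_methylation(unconverted_reads, total_reads):
    """Percent methylation at one cytosine; None when the base has no coverage."""
    if total_reads == 0:
        return None
    return 100.0 * unconverted_reads / total_reads


print(cytosine_context("CAGT", 0), site_percent_methylation(3, 12))
```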

Clone based bisulfite sequencing. Because all cytosines are potential sites for 
DNA methylation in plants, and because whole-genome bisulfite sequencing 
demonstrated that CG methylation is maintained at high levels in both normal 
and mantled ramets, bisulfite sequencing amplicon primers were designed to 
include CG dinucleotides, but exclude CHG and CHH trinucleotides. Within 
primer sequences, CG dinucleotides were assumed to be methylated. The ampli- 
con (amplified from the antisense strand) contained 13 CHG sites, including the 
CHG site at the Karma splice acceptor site. Then 2 µg of each sample was bisulfite-converted
as described for whole-genome bisulfite sequencing. In total, 30 ng
converted DNA was used for PCR amplification in 1× HiFi Hotstart Uracil+
Ready Mix (Kapa). Amplicons were cloned using the TOPO TA Cloning Kit
(Invitrogen) following A-tailing by Klenow treatment. For each sample, 48 white 
colonies were individually picked, propagated and plasmid DNA extracted. 
Plasmid inserts were PCR-amplified and Sanger-sequenced (ABI 3730) using 
vector-specific primers. Sequencing was performed on 48 clones per sample,
and reads from plasmids not including the amplicon insert are not shown. 
Sequences were base called in CONSED and methylation densities at each CHG 
site were calculated. Where possible, heterozygous non-cytosine SNPs were scored 
so that each allele could be analysed independently. In cases where a polymorph- 
ism changed a CHG site to either a CG or CHH site, the non-CHG variant was not 
included in calculations of CHG methylation. Because CHH methylation was
determined to be consistently very low in both normal and mantled ramets,
conversion of CHH sites within the amplicon was used to control for bisulfite 
conversion rates. All samples analysed displayed <4% methylation of CHH sites, 
demonstrating that bisulfite conversion was >96% complete in all samples. 
Primer sequences are available on request. 
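
The two per-sample quantities described above, CHG methylation density per site and bisulfite conversion rate estimated from CHH sites, can be sketched as below. The encoding of clone reads as strings ('C' for an unconverted, methylated call, 'T' for a converted call and '-' for a site excluded for that clone, for example because of a SNP) is an assumption made for illustration.

```python
def chg_site_densities(clones):
    """Fraction of clones methylated at each CHG site, ignoring excluded ('-') calls."""
    densities = []
    for site in range(len(clones[0])):
        calls = [clone[site] for clone in clones if clone[site] in "CT"]
        densities.append(calls.count("C") / len(calls) if calls else None)
    return densities


def conversion_rate(chh_calls):
    """Fraction of CHH cytosines read as T, used as a proxy for conversion efficiency."""
    calls = [c for clone in chh_calls for c in clone if c in "CT"]
    return calls.count("T") / len(calls)


# Four clones scored at three CHG sites; the third clone has site 2 excluded.
print(chg_site_densities(["CCT", "CTT", "C-T", "TTT"]))
```

Using CHH sites as the conversion control relies on the stated observation that CHH methylation is consistently very low in these samples.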

mRNA and siRNA sequencing. Transcriptome sequencing was performed on 
shoot apex, early inflorescence (<2 cm) and late stage inflorescence (three normal 
and three mantled female inflorescence biological replicates each). About 2-3 µg
total RNA was used to construct individually barcoded Illumina TruSeq stranded 
libraries. Libraries were pooled in sets of four, and each pool was sequenced in one 
lane of a HiSeq 2000 flow cell to generate paired 100-bp reads. sRNA fractions of 
female shoot apex tissue at stage 0 and female inflorescence tissue at stages 2, 3, 4 and 
5 (7 mantled and 5 normal biological replicates at stage 0, 6 mantled and 8 normal 
biological replicates each at stages 2 and 3, 7 mantled and 5 normal biological 
replicates at stage 4, and 5 mantled and 4 normal biological replicates at stage 5), 
as well as second-passage tissue cultures recloned from mantled (n = 1) or normal 
(n= 2) ramets, were used to construct Illumina TruSeq sRNA libraries and 
sequenced following the same strategy as mRNA sequencing. mRNA sequencing 
data was used to construct gene models for all observed EgDEF1 alternative tran- 
scripts. siRNA reads mapping to the genomic scaffold including EgDEF1 were 
identified and normalized as fragments per 1,000 reads mapped to the genome 
(FPKM). FPKM values for each 24-mer were compared between biological replicates 
of normal and mantled samples by a two-tailed Student’s t-test, assuming equal 
variance. 
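
The normalization and comparison described above can be sketched as follows: each 24-mer count is scaled per 1,000 genome-mapped reads, and normal and mantled replicates are compared with an equal-variance (pooled) two-sample t statistic. The function names are hypothetical; the sketch returns the t statistic and degrees of freedom only, and obtaining the two-tailed P value would require a t-distribution routine from a statistics library.

```python
import math
import statistics


def per_thousand_mapped(count, total_mapped_reads):
    """Scale a raw siRNA read count to reads per 1,000 genome-mapped reads."""
    return 1000.0 * count / total_mapped_reads


def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / df
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(pooled_var * (1 / na + 1 / nb))
    return t, df


normal = [1.0, 2.0, 3.0]
mantled = [2.0, 3.0, 4.0]
print(pooled_t(normal, mantled))
```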

qRT-PCR assays. To specifically quantify cDEF1 expression, a forward primer 
spanning the junction of exons 1 and 2 was used with a reverse primer within 
exon 7 (Extended Data Fig. 7b-d). The same forward primer was used with a 
reverse primer including intron 5 sequence to specifically quantify tDEF1 express- 
ion. Finally, a forward primer spanning the junction of exons 4 and 5 was used with 
a reverse primer within Karma, downstream of the exon 5/Karma splice junction, 
to specifically quantify kDEF1 expression. Assays were optimized using normal
and mantled late stage inflorescence total RNA, and cDNAs were Sanger 
sequenced to confirm the identity of the amplicons. Standard curves generated 
from serially diluted cDNAs were generated for each primer pair, as well as for two 
internal oil palm reference gene assays (Extended Data Fig. 7e, f). Gene expression
was quantified in developing inflorescence stages 0, 2, 3, 4 and 5. All first-strand
cDNA reverse transcription reactions were performed from 1 µg total RNA
using a cocktail of reverse primers specific to EgDEF1 exons 6 and 7, as well as 3’
regions of Karma. For each stage, three technical replicates were performed for
each of the three biological replicates per phenotype, per stage. qRT-PCR reactions
were performed using 1 µl first-strand cDNA in 1× Roche SYBR Master Mix
on a Roche LC480 instrument. Cycle thresholds above 33 cycles were not included
in calculations, and detectable expression was calculated only for samples in which 
expression was detected in at least 2 of 3 technical replicates. Expression levels were 
quantified by extrapolation from the standard curve for each assay, and expression 
levels relative to the reference gene were calculated. Primer sequences are provided 
in Extended Data Fig. 7d. 
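
The quantification steps described above can be sketched under stated assumptions: a least-squares line of cycle threshold against log10 dilution serves as the standard curve, quantities are read back from that line, technical replicates above 33 cycles are dropped, and a sample counts as detected only if at least 2 of 3 replicates remain. All function names, and the use of the replicate mean, are assumptions for illustration rather than the authors' implementation.

```python
def fit_standard_curve(log10_dilutions, cts):
    """Least-squares fit of Ct = slope * log10(dilution) + intercept."""
    n = len(cts)
    mx = sum(log10_dilutions) / n
    my = sum(cts) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(log10_dilutions, cts))
    sxx = sum((x - mx) ** 2 for x in log10_dilutions)
    slope = sxy / sxx
    return slope, my - slope * mx


def efficiency(slope):
    """Amplification efficiency; 1.0 corresponds to perfect doubling per cycle."""
    return 10.0 ** (-1.0 / slope) - 1.0


def quantity(ct, slope, intercept):
    """Relative template quantity extrapolated from the standard curve."""
    return 10.0 ** ((ct - intercept) / slope)


def mean_detectable_ct(cts, max_ct=33.0, min_detected=2):
    """Mean Ct of replicates at or below max_ct, or None if too few were detected."""
    kept = [c for c in cts if c is not None and c <= max_ct]
    return sum(kept) / len(kept) if len(kept) >= min_detected else None


def relative_expression(target_qty, reference_qty):
    """Target expression relative to the reference gene."""
    return target_qty / reference_qty
```

For example, `relative_expression(quantity(ct_target, *curve_target), quantity(ct_ref, *curve_ref))` would give the expression of one transcript relative to the reference gene, where each curve is the `(slope, intercept)` pair returned by `fit_standard_curve`.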

We found only a subtle decrease in expression of cDEF1 in female mantled 
relative to normal inflorescence at stage 3 (Fig. 4b). A more significant decrease 
was previously reported® but was not detected consistently. More recently, an 
increase in the ratio of tDEF1 to tDEF1 + cDEF1 in samples from early and late
time points at each stage of inflorescence development was reported¹². However,
when absolute values of (tDEF1/tDEF1 + cDEF1) from early and late samples 
from any given stage are pooled (see figure 6b and Supplementary figure 8b in 
ref. 12) then only modest increases in expression are observed in mantled samples 
in agreement with our results (Fig. 4b). 

Extended Data Figure 1 | Spikelets from clonal palms of different fruit form phenotypes. a, Spikelets from a normal ramet. b, Spikelet from a fertile mantled
ramet. c, Spikelet from a parthenocarpic mantled ramet. d, Spikelet from a revertant ramet displaying both normal (N) and mantled (M) fruits in the same spikelet. 

Extended Data Figure 2 | Annotation of genome-wide differentially
methylated loci. Sequences of microarray features reporting significant
differential DNA methylation between fully normal and fully mantled leaf
DNA samples of one or more clonal lineages (P < 0.05, two-sided Student’s
t-test, Methods) were mapped to the reference E. guineensis pisifera genome.
Numbers of biological replicates per clonal lineage are provided in
Extended Data Fig. 3. Features were assigned to gene and repeat classes
according to annotations of genomic elements mapped within 3 kb of the
microarray feature sequence, as this is the distance at which McrBC is capable
of monitoring DNA methylation density. The repeat class includes all
repetitive sequences, including transposons and pisifera-specific repetitive
sequences. Features mapping within 3 kb of both a gene and a repeat were
assigned to both classes. The number of features reporting hypermethylation
(red) and hypomethylation (green) are plotted.

Extended Data Figure 3 | Summary of DNA methylation changes predicted
by EWAS within clonal lineages. Rows indicate independent clonal
lineages from four oil palm industry sources (source A-D, as indicated in
Fig. 1e). Clone lineages map to sources as follows: 7-9, source A; 1 and 2, source
B; 3-6, source C; 10 and 11, source D. The numbers of fully normal and fully
mantled palms per lineage represented are indicated to the left. Columns
represent each microarray feature mapping to the EgDEF1 (open box at top)
and upstream region. The relative positions of Rider, Karma and Koala
elements are indicated. The arrow indicates the direction of EgDEF1
transcription. Features reporting significant hypomethylation or
hypermethylation in mantled relative to normal clones are indicated as black
and grey boxes, respectively. White boxes indicate features reporting no
significant DNA methylation difference. Only clonal lineages including more
than 1 ramet per phenotype are shown to determine statistical significance
within each clonal lineage (n = 41 normal; n = 37 parthenocarpic mantled).
Ramets from four additional clonal lineages were included in the source-by-source
analysis shown in Fig. 1e.

Extended Data Figure 4 | DNA methylation assays and supporting DNA
methylation data. a, Diagram of the EgDEF1 gene including the Karma
element within intron 5 (orange box). Black boxes represent exons and the 
horizontal line represents introns. Scale bar is in base pair units. b, Blue tick 
marks represent the relative positions of the four microarray features 
reporting significant hypomethylation of mantled clones in all source lineages. 
The left-most feature includes the Karma splice acceptor site. Horizontal lines 
labelled B (BbvI) and R (RsaI) indicate the relative positions of amplicons
used for qPCR-based CHG methylation assays. The BbvI amplicon also 
includes a ScrFI site (S) used in d. The relative position of the bisulfite 
sequencing amplicon used to determine Karma splice site CHG methylation is 
shown below the qPCR amplicons. c, Diagrams of the three alternatively spliced 
EgDEF1 transcripts. Black boxes represent exons included in each transcript. 
The dotted lines represent intronic sequences spliced out of the mature 
mRNA transcripts. The red box represents Karma element sequence spliced to 
EgDEF1 exon 5 in the kDEF1 transcript. The blue box represents EgDEF1
intron 5 sequence included in the tDEF1 transcript that does not use the exon 5 
splice donor site. d, In addition to adult leaf samples analysed by BbvI and 
RsaI qPCR assays (Fig. 2c), 37 samples were found to have a SNP in the BbvI site
and were therefore analysed by ScrFI and RsaI qPCR assays (Methods and 
Extended Data Fig. 4b). Linear discriminant analysis was performed between 
normal (n = 14) and mantled (n = 22 parthenocarpic mantled; n = 1 fertile 
mantled) samples. Combining these results with those shown in Fig. 2c, 
sensitivity and specificity for detection of mantling are each 94%. e, Bisulfite 
sequencing of controls, FN1 and FN2 (Fig. 2c). mCHG density was calculated 
for the three CHG sites covered by the unique common microarray feature 
(Figs 1e and 2d-g). FN1, FN2 and the mantled control were significantly
hypomethylated relative to the normal control (*P < 0.0001, two-tailed 
Fisher’s exact test). 

Extended Data Figure 5 | Clone based bisulfite sequencing maps of normal
and mantled phenotype fruits from epigenetic mosaics. The heatmap
format is as described in Fig. 2d-g. Grey boxes indicate a site in which a SNP on
allele a results in a CHG to CHH site conversion. Mosaic clone 1 represents a
revertant clone yielding 95% normal fruit. Mosaic clone 2 represents a
revertant clone yielding 99% normal fruit. Alleles were analysed independently
based on a SNP not affecting a potentially methylated base. Statistical
analyses of methylation at the three CHG sites spanning the Karma splice site
are shown in Fig. 3e, f.

Extended Data Figure 6 | CHG methylation in rachis sectors of an oil palm
yielding 7% normal fruit (clone lineage 2 in Fig. 3d). Rachis of three 
successive fronds was dissected into 8 equal sectors. DNA methylation in each 
sector per frond was measured by BbvI and ScrFI assays, as described in 
Methods. Average DNA methylation density measurements of three technical 
replicates per frond, per sector, per assay are plotted on a radial graph 
representing the 8 rachis sections around the palm trunk (ScrFI assay, light 
blue; BbvI assay dark blue). Sector numbering was ratcheted for frond 2 versus
1, and frond 3 versus 2 based on the R² best fit of CHG methylation density
around the circumference of the palm to correct for out-of-register numbering 
of rachis sectors between successive fronds (data not shown). Consistent 
with the fact that this oil palm yields only 7% normal phenotype fruit, most 
DNA methylation measurements are consistent with the mantled phenotype. 
However, sectors 8 and 2 display gains of CHG methylation in rachis sectors 
of all three fronds, and reach or approach normal levels in sectors 8 and 2 of 
frond 2, thus demonstrating mosaicism directly. 

Extended Data Figure 7 | Protein sequences and summary of qRT-PCR
assay designs. a, Residues highlighted in red are encoded by Karma sequence
spliced to exon 5 of EgDEF1. The alternate splicing event disrupts the
transcription activation domain of EgDEF1. Twelve variant amino acids are
coded by Karma sequence, followed by a stop codon. b, Diagram of EgDEF1
locus including positions of RT-PCR primers. cDEF1 transcripts were
detected using primer a (spanning the splice junction of exons 1 and 2) and
primer c (internal to exon 7). kDEF1 transcripts were detected using primer b
(spanning the splice junction of exons 4 and 5) and primer d (internal to Karma
ORF2). tDEF1 transcripts were detected using primer a and primer e (spanning
the 3’ end of exon 5 and including tDEF1-specific intron 5 sequence). c, All
assays were confirmed to give a single band of the correct size by agarose gel
electrophoresis. Amplicons were Sanger sequence verified. Note that no band is
amplified using the kDEF1 primer pair in samples from normal inflorescence,
consistent with lack of expression of kDEF1 in normal inflorescence.
d, Sequences of primers diagrammed in b. e, f, Standard curves for qRT-PCR
assays. PCR amplicons including each qRT-PCR amplified sequence were
serially diluted and quantified in triplicate by qPCR using the indicated primer
pairs. Dilutions (x axis) were plotted against the measured cycle threshold (y
axis). e, Standard curves for cDEF1 (blue), kDEF1 (red) and tDEF1 (green). Line
equations were used to calculate the efficiency of each primer pair. The
efficiency of each primer pair was used in calculations for quantification of
expression of each associated transcript. f, Standard curves for two endogenous
oil palm control genes. The efficiency of each primer pair was used in
calculations for quantification of expression of each associated transcript.
Expression of each alternative transcript was calculated relative to the
PD00569 control. Control qRT-PCR primers are described previously.

Extended Data Figure 8 | Antisense 24-nucleotide siRNA analysis of
inflorescence development. siRNA expression at inflorescence stages 0 (shoot
apical meristem), 2, 3, 4 and 5, was analysed by Illumina siRNA sequencing
(Methods). FPKM normalized expression values for each measured
24-nucleotide siRNA are plotted in scale with the genomic elements
diagrammed at the top of the figure. Grey bars indicate detected 24-nucleotide
siRNAs that are not significantly differentially expressed between normal
relative to mantled tissues (P > 0.05, Student’s t-test, two-tailed assuming equal
variance). Differentially expressed 24-nucleotide siRNAs are plotted as green or
red bars for normal or mantled tissues, respectively. Bars above and below the
zero line represent sense and antisense siRNAs, respectively, and are plotted on
the same scale in both directions.

Extended Data Figure 9 | Relative abundance of 21- and 24-nucleotide
sRNA in normal and mantled reclones and stage 0 inflorescence.
a, Distribution of sRNA lengths derived from mantled reclone (blue) and
stage 0 inflorescence (red). b, Distribution of sRNA lengths derived from
normal reclone (blue) and stage 0 inflorescence (red). Read lengths of sRNA
sequencing reads are plotted as the percentage of total reads for each
incremental sRNA nucleotide length.

Extended Data Figure 10 | CHG methylation in recloned tissue cultures.
Tissue cultures were reconstituted from normal and mantled ramets from two
clonal lineages (‘clones of clones’). Methylation at three CHG sites across
the Karma DMR was quantified by qPCR assays at two (SC2) and seven (SC7)
passages in tissue culture. Cultures derived from normal ramets displayed
higher CHG methylation than those derived from mantled ramets. In both
normal and mantled reclones, CHG methylation generally decreased with
time in culture. At SC2, the time point at which 24-nucleotide siRNAs were
measured (Fig. 4c), the culture from normal ramet lineage 1 had lost
methylation at the BbvI site (the site nearest the Karma splice acceptor site).

BET inhibitor resistance emerges from leukaemia stem cells

Bromodomain and extra terminal protein (BET) inhibitors are
first-in-class targeted therapies that deliver a new therapeutic oppor- 
tunity by directly targeting bromodomain proteins that bind acety- 
lated chromatin marks'”. Early clinical trials have shown promise, 
especially in acute myeloid leukaemia’, and therefore the evaluation 
of resistance mechanisms is crucial to optimize the clinical efficacy of 
these drugs. Here we use primary mouse haematopoietic stem and 
progenitor cells immortalized with the fusion protein MLL-AF9 to 
generate several single-cell clones that demonstrate resistance, 
in vitro and in vivo, to the prototypical BET inhibitor, I-BET. 
Resistance to I-BET confers cross-resistance to chemically distinct 
BET inhibitors such as JQ1, as well as resistance to genetic knock- 
down of BET proteins. Resistance is not mediated through increased 
drug efflux or metabolism, but is shown to emerge from leukaemia 
stem cells both ex vivo and in vivo. Chromatin-bound BRD4 is
globally reduced in resistant cells, whereas the expression of key
target genes such as Myc remains unaltered, highlighting the exist- 
ence of alternative mechanisms to regulate transcription. We dem- 
onstrate that resistance to BET inhibitors, in human and mouse 
leukaemia cells, is in part a consequence of increased Wnt/β-catenin
signalling, and negative regulation of this pathway results in restora- 
tion of sensitivity to I-BET in vitro and in vivo. Together, these 
findings provide new insights into the biology of acute myeloid 
leukaemia, highlight potential therapeutic limitations of BET inhi- 
bitors, and identify strategies that may enhance the clinical utility of 
these unique targeted therapies. 

Our increasing knowledge of cancer genomes and epigenomes 
not only implicates epigenetic regulators in the initiation and maintenance
of cancer but also highlights an opportunity for therapeutic
intervention**. One of the most promising epigenetic therapies to have 

Figure 1 | Establishment of a model of BET inhibitor resistance. a, Strategy
for the generation of resistant clones. HSPC, haematopoietic stem and
progenitor cells. b, Resistance to I-BET demonstrated in cell proliferation
assays performed in biological triplicate (mean ± s.d.). c, Resistance to JQ1
demonstrated in cell proliferation assays performed in biological triplicate
(mean ± s.d.). d, Resistance to I-BET in clonogenic assays performed in
biological duplicate (mean ± s.e.m.). e, Resistance to I-BET-mediated
induction of apoptosis in biological triplicate experiments (mean ± s.e.m.).
f, Resistant clones do not demonstrate cell cycle arrest in biological triplicate
experiments (mean ± s.e.m.). g, Resistance to shRNA-mediated knockdown of
Brd4 in biological duplicate experiments (mean ± s.e.m.). h, i, Kaplan-Meier
curve of secondary syngeneic transplant of sensitive (h) and resistant (i) clones
(n = 6 per group, statistical significance calculated using a log-rank test).
Dotted line denotes treatment starting on day 9.

emerged in the past decade are small molecule inhibitors targeting the
bromodomains of BET family proteins (BRD2, BRD3, BRD4 and 
BRDT)'”. While these non-catalytic inhibitors are currently being 
evaluated in clinical trials across a range of malignancies, the molecular 
and cellular mechanisms that govern sensitivity and resistance remain 
largely unknown. We and others have previously demonstrated the 
pre-clinical efficacy of BET inhibitors in acute myeloid leukaemia 
(AML)**, and early clinical evidence has reinforced the potential of 
these drugs’. 

To study BET inhibitor resistance in a model of AML, we trans- 
duced mouse bone marrow haematopoietic stem and progenitor 
cells (HSPCs) with MLL-AF9. After a selection period in cytokine- 
supplemented methylcellulose in the presence of dimethylsulfoxide 
(DMSO; vehicle) or I-BET at the IC40 value (40% of maximal inhibitory
effect concentration) of these cells (400 nM), we isolated individual
blast colonies, each derived from a single cell, to generate four
independent vehicle-treated and five independent I-BET-resistant cell 
lines (Fig. 1a). The selection pressure on I-BET-resistant clones was 
sequentially increased to establish clones stably growing at various 
concentrations including those greater than the IC90 value of the parental
and vehicle-treated cells (Fig. 1a, b and Extended Data Fig. 1a).
While chemically distinct inhibitors directed against the same target
have sometimes overcome resistance, our data indicates that resistance
to I-BET also confers cross-resistance to the chemically distinct
BET inhibitor JQ1 (ref. 10) (Fig. 1c and Extended Data Fig. 1b). 

Direct comparison of these cell lines demonstrated that although 
vehicle-treated cells remained exquisitely sensitive to I-BET-mediated
suppression of clonogenic capacity, induction of apoptosis and cell 
cycle arrest, the resistant cells were now impervious to these estab- 
lished phenotypic responses at levels that positively correlated with the 
degree of selective pressure applied (Fig. 1d—f and Extended Data Fig. 
1c). High-content short hairpin RNA (shRNA) screens in this AML 
model previously identified Brd4 as the major therapeutic target of 
BET inhibitors®. Using an inducible shRNA system, we were able to 
replicate these findings in our vehicle-treated clones; however, BET- 
inhibitor-resistant clones were significantly less susceptible to genetic 
depletion of Brd4 (Fig. 1g and Extended Data Fig. 1d-h). 

Consistent with our previous data’, I-BET leads to a significant 
survival advantage in this AML model (Fig. 1h). By contrast, this 
survival advantage is abrogated following an identical treatment strat- 
egy in recipients of resistant cells (Fig. 1i). No differences in morpho- 
logy or pattern of disease between sensitive or resistant cells were 
observed (Extended Data Fig. 1i and data not shown). Together, these
findings establish a robust model of BET inhibitor resistance in vitro 

Figure 2 | Resistance to BET inhibitors arises from the LSC compartment.
a, Intracellular and extracellular concentrations of I-BET as assessed by
quantitative mass spectrometry in biological duplicates (mean ± s.e.m.,
statistical significance calculated using a two-tailed Student’s t-test). NS, not
significant. b, Resistant clones demonstrate an immature immunophenotype
(Gr1−/CD11b−). c, Clonogenic capacity of the Gr1−/CD11b− and
Gr1+/CD11b+ populations in resistant clones performed in biological
duplicate (mean ± s.e.m.). d, Limiting dilution transplantation analyses
Kaplan-Meier curves of C57BL/6 mice injected with indicated number of cells;
detailed cohort and survival data can be found in Extended Data Fig. 3a. e, LSC
frequency from limiting dilution transplantation analyses. Dotted lines indicate
95% confidence intervals (CI). f, L-GMP frequency in whole mouse bone
marrow (mean ± s.e.m.) after serial transplantation of I-BET-exposed
leukaemias from in vivo resistance model. Statistical significance of survival
outcomes determined using log-rank test of Kaplan-Meier survival estimates.
g, Proportion of human leukaemic CD34+ cells, GMPs and LMPPs in
whole mouse bone marrow (mean ± s.e.m.) after I-BET exposure in an AML
PDX model (n = 5). h, I-BET-naive L-GMPs do not demonstrate intrinsic
resistance to I-BET in clonogenic assays in biological duplicate experiments
(mean ± s.e.m.), see also Extended Data Fig. 5b. M1, mouse 1.

and in vivo, and show that resistant cells are refractory to either chemical
or genetic perturbation of Brd4.

Major mechanisms of drug resistance include reduced drug influx 
or increased drug efflux'’. To address this issue, we performed quant- 
itative mass spectrometry, which revealed no significant difference in 
the amount of intracellular or extracellular drug (Fig. 2a). However, we 
noted that resistant cells were smaller and more homogenous by flow 
cytometry (Extended Data Fig. 1j), and further immunophenotypic 
characterization of sensitive and resistant cells revealed marked differences
in the expression of the lineage markers Gr1 and CD11b
(Fig. 2b). These findings were replicated in an independent MLL- 
ENL model of BET inhibitor resistance (Extended Data Fig. 2). 

While the precise immunophenotype of leukaemia stem cells (LSCs) 
in mouse MLL leukaemia models has been debated’*"', it has prev- 
iously been shown that LSC potential primarily resides in the more 
immature, lineage-negative (Lin⁻, Sca⁻, cKit⁺, CD34⁺, FcγRII/RIII⁺)
leukaemic granulocyte-macrophage progenitor (L-GMP) population, 
raising the possibility that BET-inhibitor-resistant cells are enriched for 
LSCs'*'*"°. Consistent with this notion, we noted a significant increase
in the blast colony forming potential of the Lin⁻ (Gr1⁻/CD11b⁻)
population, and a marked increase in L-GMP cells in our resistant
population before primary transplantation (Fig. 2c and Extended
Data Fig. 1k).

While primary transplantation of vehicle-treated cells paralleled the 
natural history of this AML model, remarkably, primary transplanta- 
tion of I-BET-resistant cells resulted in considerably shorter leukaemia 
latency (Fig. 2d). Moreover, limiting dilution transplantation analyses 
confirm that I-BET-resistant cells were markedly enriched for LSC 
potential (Fig. 2d, e and Extended Data Fig. 3a). To assess the relevance 
of these findings to resistance that emerges in vivo after sustained 
exposure to I-BET, we derived an independent in vivo model of 
I-BET resistance (Extended Data Fig. 3b, c). These data validated 
findings from the ex vivo model, and show that in vivo BET-inhibitor 
resistance also emerges from an L-GMP/LSC population (Fig. 2f and 
Extended Data Fig. 3d-g). Importantly, these I-BET-resistant AML 
cells have a functional LSC frequency of approximately 1:6; this is 
virtually identical to what has previously been reported for a purified 
L-GMP population”. 
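
The LSC frequency quoted above comes from limiting dilution transplantation. As an illustration only, the sketch below estimates a frequency with the standard single-hit Poisson model, in which a mouse stays leukaemia-free with probability exp(−f × cells injected). The cohort numbers are hypothetical, and the fitting routine is an assumption for illustration rather than the software used in this study.

```python
import math

def neg_log_likelihood(freq, cohorts):
    """Negative log-likelihood of the single-hit Poisson model.

    cohorts: list of (cells_injected, mice_total, mice_with_leukaemia).
    """
    nll = 0.0
    for dose, total, positive in cohorts:
        negative = total - positive
        # log P(no LSC received) = -freq * dose
        nll += negative * freq * dose
        if positive:
            # log P(at least one LSC received) = log(1 - exp(-freq * dose))
            nll -= positive * math.log(-math.expm1(-freq * dose))
    return nll

def estimate_lsc_frequency(cohorts, lo=1e-6, hi=1.0, iters=100):
    """Golden-section search for the frequency that maximises the likelihood.

    The log-likelihood is concave in freq, so a bracketing search is enough.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if neg_log_likelihood(c, cohorts) < neg_log_likelihood(d, cohorts):
            b = d
        else:
            a = c
    return (a + b) / 2

# Hypothetical cohorts: (cells injected, mice injected, mice developing leukaemia)
example = [(1000, 5, 5), (100, 5, 5), (10, 5, 4), (1, 5, 1)]
f_hat = estimate_lsc_frequency(example)
print(f"Estimated LSC frequency: 1 in {1 / f_hat:.1f} cells")
```

Confidence intervals, as drawn in Fig. 2e, would need an additional step (for example a likelihood-ratio interval) that is not shown here.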

To extend these findings into primary patient samples we treated a 
patient-derived xenograft (PDX) model of AML with I-BET. While the 
immunophenotype of human AML LSCs can be variable¹⁶, several
PDX models have shown that LSCs are enriched within CD34⁺
Figure 3 | Genetic, epigenetic and transcriptional characterization of BET-
inhibitor-resistant cells. a, Proliferation assays in sensitive, resistant cells
maintained in I-BET and after 8 weeks drug withdrawal (mean ± s.d., n = 12
per group). b, Cell cycle profile in resistant clones after drug withdrawal
performed in biological triplicate experiments (mean ± s.e.m.). c, Immunophenotype
of sensitive, resistant and drug-withdrawal cells. d, Whole-exome
capture sequencing data from vehicle-treated and resistant clones normalized
to the parental cell line; red regions denote copy number loss and green
regions denote copy number gain. e, Brd4 binding profiled across all annotated
transcriptional start sites (TSSs). f, Brd4 binding and histone 3 Lys 27
acetylation (H3K27ac) at Myc enhancer elements. g, Principal component (PC)




analysis of parental cells, vehicle-treated clones (n = 4), resistant clones (n = 9) 
and resistant clones after drug withdrawal (n = 2). Parentheses denote 
concentration of I-BET (nM) in which resistant clones have been stably 
maintained. h, GSEA identifies enrichment of a published LSC signature in 
resistant clones. Upregulated and downregulated genes in the published LSC 
signature are shown in red and blue, respectively, and correlate with 
upregulated and downregulated (false discovery rate (FDR) < 5.0 X 10°) 
genes in the I-BET-resistant clones. i, Statistically significant upregulation 
(shaded red) of the WNT/β-catenin and TGF-β pathways and downregulation
(shaded blue) of the NF-κB pathway is observed in all resistant clones (n = 11).
cells'*’’, which immunophenotypically parallel GMPs or lymphoid-
primed multipotent progenitors (LMPPs)¹⁸. Consistent with the data
from our mouse AML models, we find that I-BET treatment enriches 
for the leukaemic LMPP population (Fig. 2g and Extended Data Fig. 4). 

To understand whether LSCs were intrinsically resistant to I-BET,
we sorted L-GMPs from mice that were I-BET-naive, and challenged
them with 1 μM of I-BET in clonogenic assays. While this dose virtually
eradicates the clonogenic potential of I-BET-naive bulk leukaemia
cells (Fig. 1d and ref. 7), between 30 and 40% of L-GMPs are able
to survive (Fig. 2h and Extended Data Fig. 5). Moreover, initial treatment
with I-BET in vivo does not result in an immediate increase in
L-GMPs; instead, this population progressively emerges with continuous
and sustained exposure to drug in vivo (Fig. 2f). These findings
suggest that immunophenotypically homogenous L-GMPs/LSCs
show marked heterogeneity in their response to I-BET, and that not
all L-GMPs are intrinsically resistant to BET inhibitors.

We next sought to understand whether BET inhibitor resistance was
reversible in the absence of continuing selective pressure with I-BET.
Surprisingly, we find that BET inhibitor sensitivity was only partially
restored (Fig. 3a, b), and these cells only partially reacquire the immunophenotype
of sensitive I-BET-naive cells (Fig. 3c). Moreover, transcriptionally,
they also adopt an intermediate state between sensitive cells and
those resistant to I-BET above the IC₅₀ value of the drug (Fig. 3g).

To explore the molecular aetiology for BET inhibitor resistance 
further, we initially performed whole-exome sequencing in the par- 
ental and two separate vehicle/I-BET-resistant cell lines (Fig. 3d and 
Extended Data Fig. 6). Similar to human leukaemias driven by MLL 
fusion proteins¹⁹, these mouse leukaemia cells do not demonstrate
significant genomic instability (Extended Data Fig. 6). Notably,
although independently established resistant clones behaved identically
in all functional analyses described above, there were no gatekeeper
mutations in the bromodomains of Brd2/3/4, and no shared
copy number aberrations. Moreover, only a few mutations with no
apparent functional relevance to AML and/or BET activity were
shared across several resistant cell lines (Fig. 3d and Extended Data
Fig. 6a, b).

We, and others, have shown that treatment with BET inhibitors 
results in incomplete displacement of Brd2, Brd3 and Brd4 from chro- 
matin®”. Similarly, we noticed that resistant cells stably growing in 
I-BET also showed a decrease in chromatin-bound Brd2, Brd3 and 
Brd4 (Fig. 3e and Extended Data Fig. 7a, b). Notably, however, we 
found that key Brd4 target genes such as Myc were equally expressed 
in resistant cells despite loss of Brd4 from functional Myc enhancer 
elements (Figs 3f and 4g). These findings raised the prospect that 
alternative compensatory transcriptional programmes were active in 
BET-resistant cells. 

Global transcriptome analyses using two distinct methodologies 
showed a very high degree of correlation, and highlighted several 
transcriptional changes that clearly distinguished sensitive from res- 
istant cells (Fig. 3g and Extended Data Fig. 7c, d). Notably, and con- 
sistent with our functional data, gene set enrichment analyses (GSEA) 
of our resistant cells strongly overlapped with previously published 
transcriptome data of LSCs from this AML model’*"* (Fig. 3h and 
Extended Data Fig. 7e-j). To identify precise transcriptional 




Figure 4 | WNT/β-catenin signalling regulates sensitivity to BET
inhibition. a, Dkk1 results in re-expression of differentiation markers
(Gr1⁺/CD11b⁺). b, Kaplan-Meier curve of vehicle and I-BET-treated mice
after syngeneic transplantation of resistant clone stably transduced with Dkk1
(n = 10 per group, statistical significance calculated using a log-rank test).
Dotted line denotes treatment starting on day 16. c, Heat map representation of
Bliss interaction index across five-point dose range of pyrvinium and I-BET
performed in biological quadruplicate. d, Kaplan-Meier curve of I-BET
with/without pyrvinium after syngeneic transplantation of resistant cells.
Shaded area denotes active treatment between days 9 and 26 (vehicle n = 7,
I-BET n = 5, pyrvinium n = 5, combination n = 5, statistical significance
calculated using a log-rank test). e, Viable, shRNA-positive cells after treatment
with either vehicle or I-BET normalized to day 0 performed in biological
quadruplicate (mean ± s.d.). f, Binding of β-catenin at Myc TSS and
enhancer elements in vehicle-treated cells, resistant cells and resistant cells
with Dkk1. Mean enrichment relative to input (± s.e.m.) in chromatin
immunoprecipitation (ChIP) analysis from biological triplicate experiments.
g, Myc expression from biological triplicate experiments (mean ± s.d.). h, Heat
map representation of Brd4 and β-catenin chromatin occupancy ranked
according to amount of Brd4 binding in vehicle-treated clones. i, Correlation
between aggregate relative expression of examined β-catenin pathway genes
with responsiveness to I-BET therapy. Statistical significance determined using
Pearson’s correlation. j, Schematic model proposing the mechanism of
resistance to BET inhibitor therapy in AML.
programmes differentially expressed, we performed GSEA for major
signalling pathways. These findings demonstrated that the NF-κB
pathway was significantly downregulated, whereas both the TGF-β
and Wnt/β-catenin pathways were significantly upregulated in our
resistant cells (Fig. 3i).

We focused our attention on the Wnt/β-catenin pathway as several
components from ligand receptors to transcriptional co-activators
were noted to be transcriptionally upregulated (Extended Data
Fig. 8a). Interestingly, this pathway has previously been shown to be
a major protagonist involved in sustaining LSCs in these models of
AML"*! and in other cancer stem cells²² (Extended Data Fig. 8b). To
antagonise Wnt/β-catenin signalling specifically, we overexpressed the
Dickkopf Wnt signalling pathway inhibitor 1 (Dkk1), which resulted
in the differentiation of our resistant cells into more mature leukaemic
blasts (Fig. 4a and Extended Data Fig. 8f) and re-instated sensitivity to
I-BET both in vitro and in vivo (Fig. 4b and Extended Data Fig. 8c-h).
In support of these findings, pyrvinium, an established inhibitor of
the Wnt/β-catenin pathway²³, phenocopied these results (Fig. 4c, d
and Extended Data Fig. 8i-m). Importantly, stimulation of the Wnt/β-catenin
pathway in sensitive cells, by downregulation of the adenomatous
polyposis coli (Apc) gene, confers rapid I-BET resistance
(Fig. 4e and Extended Data Fig. 9), further highlighting the crucial
influence of this pathway on BET inhibitor efficacy.

Mechanistically, we find that in I-BET-naive cells, Brd4 is bound to
the cis-regulatory elements of target genes such as Myc (Fig. 3f), whereas
β-catenin is essentially absent (Fig. 4f). However, in I-BET-resistant
cells, Brd4 binding is decreased (Fig. 3f), but β-catenin is now bound
at these sites and able to sustain the expression of Myc (Fig. 4f, g).
Negative regulation with Dkk1 reduces chromatin-bound β-catenin
and subverts its ability to maintain the expression of Myc (Fig. 4f, g
and Extended Data Fig. 8h). Analogous to the events at Myc, we find
that in the resistant cells, chromatin occupancy of β-catenin increases at
the sites where Brd4 is displaced from chromatin, and this increased
β-catenin occupancy is abrogated by the expression of Dkk1 (Fig. 4h).

We have previously shown that BET inhibitors have a broad range
of efficacy against human AML samples®”. To explore the translational
relevance of our findings, we compared baseline expression of WNT/β-catenin
target genes to the degree of I-BET-induced apoptosis in
these samples (Extended Data Fig. 10a). Notably, we find a high degree
of correlation (Fig. 4i and Extended Data Fig. 10), supporting our
findings that increased activity of the WNT/β-catenin pathway
negates the effects of BET inhibition.
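
The comparison above relies on Pearson's correlation (Fig. 4i). A minimal sketch of the coefficient is given below for orientation; the sample values are hypothetical, and how the aggregate expression score was built from the individual genes is not specified here, so the inputs are simply treated as two paired lists.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two paired samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two paired samples of length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values: aggregate pathway expression vs. I-BET-induced apoptosis (%)
expression = [0.4, 0.9, 1.3, 2.1, 2.8]
apoptosis = [62.0, 55.0, 41.0, 30.0, 18.0]
print(f"r = {pearson_r(expression, apoptosis):.2f}")
```

A significance test on r (as reported in the figure) would additionally require the t distribution with n − 2 degrees of freedom, which is omitted from this sketch.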

New classes of anti-cancer therapy rarely emerge, and BET inhibitors
have uncovered a new therapeutic precedent: the possibility of
specifically targeting epigenetic readers (effector proteins that recognize
specific epigenetic modifications on histones or nucleotides). If
their early clinical promise is to be realized, it is imperative that we
evaluate their limitations and mechanisms of resistance to identify
rational strategies that enhance their efficacy. Using models that have
recapitulated the hierarchical structure of AML in vitro and in vivo, we
show that BET inhibitor resistance emerges from LSCs with increased
expression of the Wnt/β-catenin pathway. While not all LSCs are
intrinsically resistant, it is clear that a small proportion of these
are either transcriptionally primed or display rapid transcriptional
plasticity to survive the initial BET inhibitor challenge; these cells
subsequently thrive and become the dominant population (Fig. 4j).
This adaptive transcriptional plasticity is an emerging theme by
which malignant cells are able to escape from therapeutic pressures”,
and our findings are consistent with another report highlighting the
WNT/β-catenin pathway as a mechanism to circumvent BET inhibition²⁵.
Our approach has allowed us to sustain a highly enriched population
of LSCs in culture indefinitely, providing a unique resource to
characterize LSCs molecularly and enable screening of a range of
therapies that may ultimately deliver the opportunity to eradicate
the LSC population.




Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 


Received 4 November 2014; accepted 3 July 2015. 
Published online 14 September 2015. 


1. Dawson, M. A., Kouzarides, T. & Huntly, B. J. Targeting epigenetic readers in cancer.
N. Engl. J. Med. 367, 647-657 (2012).
2. Shi, J. & Vakoc, C. R. The mechanisms behind the therapeutic activity of BET
bromodomain inhibition. Mol. Cell 54, 728-736 (2014).
3. Herait, P. E. et al. BET-bromodomain inhibitor OTX015 shows clinically meaningful
activity at nontoxic doses: interim results of an ongoing phase I trial in hematologic
malignancies. Cancer Res. 74, CT231 (2014).
4. Dawson, M. A. & Kouzarides, T. Cancer epigenetics: from mechanism to therapy.
Cell 150, 12-27 (2012).
5. Helin, K. & Dhanak, D. Chromatin proteins and modifications as drug targets.
Nature 502, 480-488 (2013).
6. Dawson, M. A. et al. Recurrent mutations, including NPM1c, activate a BRD4-
dependent core transcriptional program in acute myeloid leukemia. Leukemia 28,
311-320 (2014).
7. Dawson, M. A. et al. Inhibition of BET recruitment to chromatin as an effective
treatment for MLL-fusion leukaemia. Nature 478, 529-533 (2011).
8. Zuber, J. et al. RNAi screen identifies Brd4 as a therapeutic target in acute myeloid
leukaemia. Nature 478, 524-528 (2011).
9. Weisberg, E., Manley, P. W., Cowan-Jacob, S. W., Hochhaus, A. & Griffin, J. D. Second
generation inhibitors of BCR-ABL for the treatment of imatinib-resistant chronic
myeloid leukaemia. Nature Rev. Cancer 7, 345-356 (2007).
10. Filippakopoulos, P. et al. Selective inhibition of BET bromodomains. Nature 468,
1067-1073 (2010).
11. Holohan, C., Van Schaeybroeck, S., Longley, D. B. & Johnston, P. G. Cancer drug
resistance: an evolving paradigm. Nature Rev. Cancer 13, 714-726 (2013).
12. Krivtsov, A. V. et al. Transformation from committed progenitor to leukaemia stem
cell initiated by MLL-AF9. Nature 442, 818-822 (2006).
13. Somervaille, T. C. & Cleary, M. L. Identification and characterization of leukemia stem
cells in murine MLL-AF9 acute myeloid leukemia. Cancer Cell 10, 257-268 (2006).
14. Wang, Y. et al. The Wnt/β-catenin pathway is required for the development of
leukemia stem cells in AML. Science 327, 1650-1653 (2010).
15. Krivtsov, A. V. et al. Cell of origin determines clinically relevant subtypes of MLL-
rearranged AML. Leukemia 27, 852-860 (2013).
16. Valent, P. et al. Cancer stem cell definitions and terminology: the devil is in the
details. Nature Rev. Cancer 12, 767-775 (2012).
17. Eppert, K. et al. Stem cell gene expression programs influence clinical outcome in
human leukemia. Nature Med. 17, 1086-1093 (2011).
18. Goardon, N. et al. Coexistence of LMPP-like and GMP-like leukemia stem cells in
acute myeloid leukemia. Cancer Cell 19, 138-152 (2011).
19. Andersson, A. K. et al. The landscape of somatic mutations in infant MLL-
rearranged acute lymphoblastic leukemias. Nature Genet. 47, 330-337 (2015).
20. Lovén, J. et al. Selective inhibition of tumor oncogenes by disruption of super-
enhancers. Cell 153, 320-334 (2013).
21. Yeung, J. et al. β-Catenin mediates the establishment and drug resistance of MLL
leukemic stem cells. Cancer Cell 18, 606-618 (2010).
22. Jamieson, C. H. et al. Granulocyte-macrophage progenitors as candidate leukemic
stem cells in blast-crisis CML. N. Engl. J. Med. 351, 657-667 (2004).
23. Thorne, C. A. et al. Small-molecule inhibition of Wnt signaling through activation of
casein kinase 1α. Nature Chem. Biol. 6, 829-836 (2010).
24. Knoechel, B. et al. An epigenetic mechanism of resistance to targeted therapy in
T cell acute lymphoblastic leukemia. Nature Genet. 46, 364-370 (2014).
25. Rathert, P. et al. Transcriptional plasticity promotes primary and acquired
resistance to BET inhibition. Nature http://dx.doi.org/10.1038/nature14898
(2015).


Supplementary Information is available in the online version of the paper. 


Acknowledgements We thank A. Bannister for critical reading of the manuscript. The 
Leukaemia Foundation Australia, Haematology Society of Australia and New Zealand, 
Royal Australasian College of Physicians and the Victorian Comprehensive Cancer 
Centre have supported C.Y.F. with PhD scholarships. M.A.D. is a Senior Leukaemia
Foundation Australia Fellow, VESKI Innovation Fellow and Herman Clinical Fellow. The 
National Health and Medical Research Council of Australia (1085015; 1066545) and 
Leukaemia Foundation Australia fund the Dawson laboratory. 


Author Contributions C.Y.F. and M.A.D. designed the research, interpreted data and
wrote the manuscript. C.Y.F., O.G., E.Y.N.L., A.F.R., S.F., D.T., K.S., D.S., P.Y., J.M., G.G., D.L.,
R.G., A.T.P. and M.A.D. performed experiments and/or analysed data. E.L., A.F.R., P.J.,
R.G.R., S.C.-W.L., C.C., S.W.L., O.A.-W., T.K., R.W.J., S.J.D., B.J.P.H., R.K.P. and A.T.P.
provided critical reagents, interpreted data and aided in manuscript preparation.


Author Information The data discussed in this publication have been deposited in the 
NCBI Gene Expression Omnibus (GEO) under accession number GSE63683. Reprints 
and permissions information is available at www.nature.com/reprints. The authors 
declare competing financial interests: details are available in the online version of the 
paper. Readers are welcome to comment on the online version of the paper. 
Correspondence and requests for materials should be addressed to M.A.D. 
(mark.dawson@petermac.org). 




METHODS 


Generation of immortalized primary mouse HSPC lines and derivation of 
clonal cell lines. Initial generation of immortalized parental cell lines was achieved 
through magnetic bead selection (Miltenyi Biotec) of c-kit positive cells, obtained 
from whole bone marrow of male and female C57BL/6 mice, and subsequent 
retroviral transduction with either an MSCV-MLL-AF9-IRES-YFP or an 
MSCV-MLL-ENL construct. 

To generate clonal resistant cell lines, the MLL-AF9-bearing parental cell line 
was serially re-plated in cytokine-supplemented methylcellulose (Methocult 
M3434, StemCell Technologies) containing either vehicle (0.1% DMSO) or drug 
(400 nM I-BET151). Individual vehicle-treated or resistant colonies were picked 
and transferred to liquid culture to generate clonal cell lines. Resistant cell lines 
were maintained continuously in drug while being incrementally exposed to
increasing concentrations of drug (up to 1 μM I-BET151). Vehicle-treated clones
were also continuously maintained in 0.1% DMSO and passaged in identical 
fashion. The parental cell line was continuously maintained with no exposure to 
vehicle or drug. 

Similarly, to generate resistant cell lines, the MLL-ENL-bearing parental cell 
line was serially re-plated in cytokine-supplemented methylcellulose containing 
either vehicle (0.1% DMSO) or drug (400 nM I-BET151). Cells growing in each 
plate were then washed and transferred to liquid culture to generate cell lines. 
Resistant cell lines were maintained continuously in drug while being incrementally
exposed to increasing concentrations of drug (up to 1 μM I-BET151).
Vehicle-treated clones were also continuously maintained in 0.1% DMSO and
passaged in identical fashion. The parental cell line was continuously maintained 
with no exposure to vehicle or drug. 

Cell culture. Primary mouse haematopoietic progenitors and derived cell lines
were grown in RPMI-1640 supplemented with mouse IL-3 (10 ng ml⁻¹),
20% FCS, penicillin (100 U ml⁻¹), streptomycin (100 μg ml⁻¹), amphotericin B
(250 ng ml⁻¹) and gentamycin (50 μg ml⁻¹). Cell lines were routinely tested for
mycoplasma contamination by PCR. Primary human leukaemia cells were
grown in the presence of IL3 (10 ng ml⁻¹), IL6 (10 ng ml⁻¹) and SCF (50 ng ml⁻¹).
Cells were incubated at 37 °C and 5% CO₂.

Cell proliferation assays. For dose-response assays, serial dilutions of I-BET151, 
JQ1 or pyrvinium were further diluted in media before addition to 96-well plates 
seeded with between 5 X 10° and 1 X 10° cells per well to obtain a 0.1% DMSO 
final concentration. After 72h incubation, resazurin was added to each well 
and plates were further incubated for 3h. Fluorescence was then read at 
560 nm/590 nm on a Cytation 3 Imaging Reader (BioTek). Cell counts were 
performed using a haemocytometer. Determination of in vitro synergy in prolif- 
eration assays was undertaken according to the method described previously”®. 
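
Synergy between I-BET and pyrvinium is reported as a Bliss interaction index (Fig. 4c). The exact formula and sign convention of the cited method are not reproduced in this text, so the sketch below assumes the common "excess over Bliss" definition: the observed combined effect minus the effect expected if the two drugs acted independently. Effects are fractional inhibition between 0 and 1, and all numbers are hypothetical.

```python
def bliss_expected(effect_a, effect_b):
    """Expected combined fractional inhibition under Bliss independence."""
    return effect_a + effect_b - effect_a * effect_b

def bliss_excess(effect_a, effect_b, effect_combo):
    """Observed minus expected effect; > 0 suggests synergy, < 0 antagonism."""
    return effect_combo - bliss_expected(effect_a, effect_b)

def bliss_matrix(single_a, single_b, combo):
    """Excess-over-Bliss for every dose pair.

    single_a[i]: inhibition by drug A alone at dose i
    single_b[j]: inhibition by drug B alone at dose j
    combo[i][j]: inhibition by A at dose i combined with B at dose j
    """
    return [
        [bliss_excess(a, b, combo[i][j]) for j, b in enumerate(single_b)]
        for i, a in enumerate(single_a)
    ]

# Hypothetical 2 x 2 example (fractional inhibition of proliferation)
drug_a_alone = [0.10, 0.30]
drug_b_alone = [0.20, 0.40]
combination = [[0.35, 0.55], [0.50, 0.75]]
for row in bliss_matrix(drug_a_alone, drug_b_alone, combination):
    print([round(value, 3) for value in row])
```

A heat map such as the one in Fig. 4c would be obtained by colouring each cell of this matrix by its value.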
Clonogenic assays in methylcellulose. Clonogenic potential was assessed 
through colony growth of derived cell lines plated in cytokine-supplemented 
methylcellulose (Methocult M3434, StemCell Technologies). Derived vehicle- 
treated and resistant cell lines were plated in duplicate at a cell dose of 2 X 10? 
per plate in the presence of vehicle (0.1% DMSO) or drug (1 μM I-BET151).
Gr1⁻/CD11b⁻ and Gr1⁺/CD11b⁺ fractions of resistant cell lines were plated in
duplicate following FACS sorting at a cell dose of between 2 X 10° and 2 X 10 cells
per plate. FACS-isolated L-GMP populations from whole mouse bone marrow 
following primary syngeneic transplant of vehicle-treated clones were plated in 
duplicate at a cell dose of between 2 X 10° and 2 X 10° cells per plate in the 
presence of vehicle (0.1% DMSO) or drug (1 μM I-BET151). Cells were incubated
at 37 °C and 5% CO₂ for 7-10 days at which time colonies were counted.

Flow cytometric analyses. Cell apoptosis was assessed using APC conjugated 
Annexin V (550475, BD Biosciences) and propidium iodide (PI) (P4864, 
Sigma-Aldrich) staining according to manufacturer’s instructions. 

For cell cycle analysis, cells were fixed overnight at −20 °C in 70% ethanol/PBS.
Before flow cytometry analysis, cells were incubated at 37 °C for 30 min in PI
staining solution (0.02 mg ml⁻¹ PI, 0.05% (v/v) Triton X-100 in PBS, supplemented
with DNase-free RNase A (19101, Qiagen)) or incubated at room temperature
for 10 min with 4’,6-diamidino-2-phenylindole (DAPI) staining solution
(1 μg ml⁻¹ DAPI, 0.05% (v/v) Triton X-100 in PBS).

Immunophenotype assessment for markers of committed differentiation was
undertaken through staining with Alexa Fluor 700 anti-Gr1 (108422, BioLegend)
and Brilliant Violet 605 anti-CD11b (101237, BioLegend). Assessment of L-GMP
populations was undertaken through staining with eFluor 660 anti-CD34 (50-0341-82,
eBioscience), biotin lineage antibody cocktail (120-001-547, Miltenyi Biotec),
PerCP/Cy5.5 anti-CD16/32 (101324, BioLegend), APC/Cy7 anti-CD117 (105826,
BioLegend) and Pacific Blue anti-Ly-6A (122520, BioLegend) followed by secondary
staining with V500 streptavidin (561419, BD Biosciences). Assessment of leukaemic
LMPP and GMP populations in patient-derived xenografts was undertaken
through staining with APC/Cy7 anti-mouse CD45.1 (110716, BioLegend), eFluor
450 anti-mouse Ter119 (48-5921-82, eBioscience), FITC anti-human CD45
(11-9459-42, eBioscience), BV711 anti-human CD38 (563965, BD Biosciences),
PE anti-human CD90 (561970, BD Biosciences), PE-Cy5 anti-human CD123
(15-1239-41, eBioscience), PerCP-Cy5.5 anti-human CD45RA (45-0458-42),
biotin anti-human CD3 (555338, BD Biosciences), biotin anti-human CD19
(555411, BD Biosciences), PE-Cy7 anti-human CD33 (333946, BD Biosciences)
and APC anti-human CD34 (555824, BD Biosciences) followed by secondary
staining with V500 streptavidin (561419, BD Biosciences).

PI or DAPI was used as a viability dye to ensure that immunophenotyping 
analyses were performed on viable cells. Appropriate unstained, single-stained 
and fluorescence minus one controls were used to determine background staining 
and compensation in each channel. 

Flow cytometry analyses were performed on a LSRFortessa X-20 flow cytometer 
(BD Biosciences) and all data analysed with FlowJo software (vX.0.7, Tree Star). 
Cell sorting was performed on a FACSAria Fusion flow sorter (BD Biosciences). 
RNA interference studies. shRNAs were cloned into TtRMPVIR (27995,
Addgene). For competitive proliferation assays, transduced cells were sorted for
shRNA-containing (Venus⁺/YFP⁺) and non-shRNA-containing (YFP⁺ only)
populations and recombined at a 1:1 ratio. After this, cells were cultured with
1 mg ml⁻¹ doxycycline to induce shRNA expression. The proportion of shRNA-
expressing (dsRED⁺/Venus⁺/YFP⁺) cells was determined by flow cytometric
analysis and followed over time. Knockdown efficiency of shRNA-expressing
and non-shRNA-containing cells was assessed after 48-72 h of doxycycline exposure
by quantitative reverse transcriptase PCR (qRT-PCR) and immunoblotting.

The following shRNA sequences were used: Brd2 (#851), 5’-CGGATTATCA
CAAAATTAT-3’; Brd4 (#498), 5’-ACTATGTTTACAAATTGTT-3’; Brd3/4
(#499), 5’-AGGACTTCAACACTATGTT-3’; Brd4 (#500), 5’-AGCAGAACAA
ACCAAAGAA-3’.

shRNAs directed against Apc were cloned into LMN-mirE-mCherry. The proportion
of shRNA-expressing (mCherry⁺) cells was determined by flow cytometric
analysis following treatment with vehicle (0.1% DMSO) or I-BET151
and followed over time. Selective advantage consequent to shRNA expression
results in enrichment of mCherry⁺ cells. Knockdown efficiency of Apc in
shRNA-expressing cells was assessed following FACS of mCherry⁺ cells.
shRNAs directed against Apc were a gift from J. Zuber, the detailed validation
of which can be found in ref. 25.
qRT-PCR. mRNA was prepared using the Qiagen RNeasy kit and cDNA syn- 
thesis was performed using SuperScript VILO kit (Life Technologies) as per man- 
ufacturer’s instructions. qPCR analysis was undertaken on an Applied Biosystems 
StepOnePlus System with SYBR green reagents (Life Technologies). 

For analysis of mouse cell line samples, expression levels were determined using
the ΔCt method and normalized to β-2-microglobulin (B2m) and/or Gapdh.
Differences in expression were assessed using a one-sided t-test for statistical
significance. Assessment of expression changes associated with I-BET151 treatment
occurred at 6 h after treatment with 1 μM I-BET151.
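
For readers unfamiliar with the ΔCt method, the sketch below shows the usual arithmetic: the threshold cycle (Ct) of the gene of interest minus the mean Ct of the reference genes, converted to relative expression as 2 to the power of −ΔCt. It assumes roughly 100% amplification efficiency and averages the reference Ct values; neither choice is stated in the text, and the Ct values are hypothetical.

```python
from statistics import mean

def delta_ct(ct_target, ct_references):
    """Ct of the target gene minus the mean Ct of the reference genes."""
    return ct_target - mean(ct_references)

def relative_expression(ct_target, ct_references):
    """Expression relative to the reference genes, assuming one doubling per cycle."""
    return 2.0 ** (-delta_ct(ct_target, ct_references))

# Hypothetical Ct values for Myc with B2m and Gapdh as references
vehicle = relative_expression(ct_target=24.0, ct_references=[19.0, 21.0])
treated = relative_expression(ct_target=26.0, ct_references=[19.0, 21.0])
print(f"Myc expression in treated relative to vehicle: {treated / vehicle:.2f}")
```

Dividing the relative expression of one condition by that of another, as in the last lines, gives the fold change between the two conditions.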

The following mouse primer pairs were used: Apc, forward 5'-GGAGTGGC 
AGAAAGCAACAC-3’, reverse 5'- AAACACTGGCTGTTTCGTGA-3’; B2m, 
forward 5'-GAGCCCAAGACCGTCTACTG-3’, reverse 5'-GCTATTTCTTTC 
TGCGTGCAT-3’; Brd2, forward 5'-TGGGCTGCCTCAGAATGTAT-3’, reverse 
5'-CCAGTGTCTGTGCCATTAGG-3’; Brd3, forward 5'-GCCAGTGAGTGTA 
TGCAGGA-3’, reverse 5’-GCCTGGGCCATTAGCACTAT-3’; Brd4, forward 
5'-TCTGCACGACTACTGTGACA-3’, reverse 5’-GGCATCTCTGTACTCTC 
GGG-3'; Ccnd2, forward 5'-CAAGCCACCACCCCTACA-3’, reverse 5’-TTGC 
CGCCCGAATGG-3'; Dkk1, forward 5'-CTGCATGAGGCACGCTATGT-3’, 
reverse 5'-AGGAAAATGGCTGTGGTCAG-3’; Dvl1, forward 5'-ATCACAC
GCACCAGCTCTTC-3’, reverse 5’-GGACAATGGCACTCATGTCA-3’; Fzd5, 
forward 5'-GGCTACAACCTGACGCACAT-3’, reverse 5’-CAGAATTGGTG 
CACCTCCAG-3'; Gapdh, forward 5'-GGTGCTGAGTATGTCGTGGA-3’, 
reverse 5'-CGGAGATGATGACCCTTTTG-3’; Gsk3b, forward 5'-TTGGAGC 
CACTGATTACACG-3’, reverse 5’-CCAACTGATCCACACCACTG-3’; Myc, 
forward 5'-TGAGCCCCTAGTGCTGCAT-3’, reverse 5'-AGCCCGACTCCGA 
CCTCTT-3’. 

For determination of baseline WNT/β-catenin pathway and target gene
expression in primary human AML samples, expression relative to the mean of
all samples was determined using the ΔCt method and normalized to GAPDH
and actin. The following human primers were used: AXIN2, forward 5'-CGGA 
CAGCAGTGTAGATGGA-3’, reverse 5’-CTTCACACTGCGATGCATTT-3’; 
CCND1, forward 5'-GCTGTGCATCTACACCGACA-3’, reverse 5’-CCACTT 
GAGCTTGTTCACCA-3’; CTNNB1, forward 5'-GACCACAAGCAGAGTGC 
TGA-3’, reverse 5'-CTTGCATTCCACCAGCTTCT-3’; FZD5, forward 5'-TTC 
CTGTCAGCCTGCTACCT-3’, reverse 5'-CGTAGTGGATGTGGTTGTGC-3’; 
MYC, forward 5’-CTGGTGCTCCATGAGGAGA-3’, reverse 5'-CCTGCCTC 
TTTTCCACAGAA-3’; TCF4, 5'-ATGGCAAATAGAGGAAGCGG-3’, reverse 
5'-TGGAGAATAGATCGAAGCAAG-3'; ACTB, forward 5’-TTCAACACCC 
CAGCCATGT-3’, reverse 5'-GCCAGTGGTACGGCCAGA-3’; GAPDH, 5’-AC
GGGAAGCTTGTCATCAAT-3’, reverse 5'-TGGACTCCACGACGTACTCA-3’. 
Immunoblotting. Whole-cell lysates were mixed with Laemmli SDS sample buf- 
fer, separated via SDS-PAGE and transferred to PVDF membranes (Millipore). 
Membranes were then sequentially incubated with primary antibodies (see anti- 
bodies) and secondary antibodies conjugated with horseradish peroxidase 
(Invitrogen). Membranes were then incubated with ECL (GE Healthcare) and 
proteins detected by exposure to X-ray film. 

Mouse tissue sample preparation. Peripheral blood samples were collected in 
EDTA-treated tubes (Sarstedt) and counted using a XP-100 analyser (Sysmex). 
Single-cell cytospins and blood smears were stained with the Rapid Romanowsky 
Staining Kit (Thermo Fisher Scientific). Bone marrow cells were isolated by flush- 
ing both femurs and tibias with cold PBS. Before flow cytometry, red blood cells 
were lysed in red blood cell lysis buffer (Sigma). 

Examination of drug efflux and metabolism by quantitative mass spectro- 
metry. Between 2 X 10° and 3 X 10° cells per well were seeded in 24-well plates 
and treated with vehicle (0.1% DMSO) or 600 nM I-BET151. After 48 h, cells were 
collected by centrifugation, washed twice in ice-cold PBS and lysed in M-PER 
buffer (78501, Thermo Scientific). Base media, supernatant, wash and cell lysates 
were quenched with 5% acetonitrile (aq) containing labetalol at 62.5 ng ml⁻¹ as the 
internal standard. These samples, in addition to serial dilutions of I-BET151 used 
to generate standard curves, were then analysed by mass spectrometry. 

HPLC-mass spectrometry apparatus and conditions: The HPLC system was an 
integrated CTC PAL auto sampler (LEAP technologies), Jasco XTC pumps 
(Jasco). The HPLC analytical column was an ACE 2 C18 30 mm × 2.1 mm 
(Advanced Chromatography Technologies) maintained at 40°C. The mobile 
phase solvents were water containing 0.1% formic acid and acetonitrile containing 
0.1% formic acid. A gradient ran from 5% to 95% acetonitrile plus 0.1% formic acid 
up to 1.3 min, was held for 0.1 min, returned to the starting conditions over 
0.05 min and was then held to 1.5 min, at a flow rate of 1 ml min⁻¹. A divert valve 
was used so the first 0.4 min and final 0.2 min of flow were diverted to waste. 

Mass spectrometric detection was by an API 4000 triple quadrupole instrument 
(AB Sciex) using multiple reaction monitoring (MRM). Ions were generated in 
positive ionization mode using an electrospray interface. The ionspray voltage was 
set at 4,000 V and the source temperature was set at 600°C. For collision dissoci- 
ation, nitrogen was used as the collision gas. The MRM of the mass transitions for 
I-BET151 (m/z 416.17 to 311.10), and labetalol (m/z 329.19 to 162.00), were used 
for data acquisition. 

Data were collected and analysed using Analyst 1.4.2 (AB Sciex). For quantification, 
area ratios (analyte/internal standard) were used to construct a standard line 
using weighted (1/x) linear least-squares regression, and sample concentrations 
were extrapolated from the area ratios of samples using this standard line. 
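
As an illustration of the weighted (1/x) standard line described above, the sketch below fits area ratio against nominal concentration with weights of 1/concentration and then back-calculates a sample concentration. It is a generic closed-form weighted least-squares fit, not the Analyst software's implementation; the function names and calibration values are hypothetical, and all calibrator concentrations are assumed to be non-zero.

```python
def fit_weighted_line(concentrations, area_ratios):
    """Fit area_ratio = slope * concentration + intercept with 1/x weighting."""
    weights = [1.0 / x for x in concentrations]  # 1/x weights; x must be > 0
    sw = sum(weights)
    sx = sum(w * x for w, x in zip(weights, concentrations))
    sy = sum(w * y for w, y in zip(weights, area_ratios))
    sxx = sum(w * x * x for w, x in zip(weights, concentrations))
    sxy = sum(w * x * y for w, x, y in zip(weights, concentrations, area_ratios))
    # Closed-form solution of the weighted normal equations
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return slope, intercept


def back_calculate(area_ratio, slope, intercept):
    """Read a concentration off the standard line for a measured area ratio."""
    return (area_ratio - intercept) / slope


# Hypothetical calibration standards (arbitrary concentration units)
standards = [1.0, 2.0, 5.0, 10.0]
ratios = [3.0, 5.0, 11.0, 21.0]
slope, intercept = fit_weighted_line(standards, ratios)
sample_concentration = back_calculate(7.0, slope, intercept)
```

The 1/x weighting gives low-concentration standards more influence on the fit than they would have in an unweighted regression.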
Mouse models of leukaemia. Primary syngeneic transplantation studies of 
stably growing derived vehicle treated or resistant cell lines in limit dilution ana- 
lyses were performed with intravenous injection of between 1 X 10' to 2 X 10° 
cells per mouse. 

Serial syngeneic transplantation studies of drug efficacy, generation of in vivo 
resistance and limit dilution analyses were performed with intravenous injection 
of between 1 X 10! to 2.5 X 10° cells per mouse obtained from bone marrow or 
spleen. Treatment with vehicle or I-BET151 at 20-30 mg kg⁻¹ began between days 
9 and 13. Pyrvinium, alone or in combination with I-BET151, was delivered 
between days 9 and 26. 

After stable retroviral transduction of resistant cell lines with a Dkk1 containing 
construct, 5 X 10° cells per mouse were injected intravenously in primary syngeneic 
transplants. Treatment with vehicle or I-BET151 at 20 mg kg⁻¹ began at 
day 16. 

Syngeneic transplantation studies were performed in C57BL/6 mice (wild-type 
or expressing Ptprc*). All mice were 6-10 weeks old at the time of sub-lethal 
irradiation (300 cGy) and intravenous cell injection. Treatment with vehicle, 
I-BET151 or pyrvinium commenced after engraftment of leukaemia as deter- 
mined by >1% yellow fluorescent protein (YFP) expression in peripheral blood 
in most mice. Mice were randomly assigned treatment groups; treatment admin- 
istration was not blinded. Sample sizes were determined according to the resource 
equation method. Differences in Kaplan-Meier survival curves were analysed 
using the log-rank statistic. 
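
For orientation, the resource equation mentioned above can be written as a short calculation. The sketch assumes the simplest form of the method for a design with equal-sized groups and no blocking, E = (total number of animals) − (number of groups), with E between 10 and 20 conventionally regarded as adequate. The function names and the example group sizes are hypothetical and are not taken from the study.

```python
def resource_equation_e(animals_per_group, n_groups):
    """Error degrees of freedom E = N - T for equal-sized groups without blocking."""
    total_animals = animals_per_group * n_groups
    return total_animals - n_groups


def is_adequate(e, lower=10, upper=20):
    """Conventional rule of thumb: E between 10 and 20 inclusive."""
    return lower <= e <= upper


# Hypothetical two-arm design (vehicle versus drug) with six mice per arm
e_value = resource_equation_e(animals_per_group=6, n_groups=2)
adequate = is_adequate(e_value)
```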

Patient derived xenograft studies were performed in NOD/SCID/Il2rg−/− 
(NSG) mice. All mice were 6-10 weeks old at the time of sub-lethal irradiation 
(200 cGy) and intravenous cell injection of 1 X 10° to 5 X 10° cells per mouse. 
Treatment with vehicle or I-BET151 at 10 mg kg⁻¹ for a 2-week period began after 
detection of >1% circulating human CD45+ cells in mouse peripheral blood at 
week 14. Treatment cohorts were matched for transplant generation. 

I-BET151 was dissolved in normal saline containing 5% (v/v) DMSO and 10% 
(w/v) Kleptose HPB. I-BET151 was delivered daily (5 days on, 2 days off) by 
intraperitoneal injection (10 ml kg⁻¹), with dose reduction of I-BET151 undertaken 
if evidence of drug intolerance was present. Pyrvinium was dissolved in 
normal saline containing 15% (v/v) DMSO and delivered daily by intraperitoneal 
injection (10 ml kg⁻¹). Dosing of pyrvinium started at 0.1 mg kg⁻¹ and escalated 
in 0.1 mg kg⁻¹ increments every second dose to a maximal dose of 0.5 mg kg⁻¹. 
All mice were kept in a pathogen-free animal facility, inspected daily and 
euthanized on signs of distress/disease. All experiments were conducted under 
either UK Home Office regulations or Institutional Animal Ethics Review Board in 
Australia. Statistical analyses of limit dilutions were undertaken according to the 
method described previously’. 
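
The pyrvinium escalation scheme described above can be expressed as a small function. This sketch reflects one reading of the text, namely that each dose level is given twice before stepping up (doses 1 and 2 at 0.1 mg kg⁻¹, doses 3 and 4 at 0.2 mg kg⁻¹, and so on until 0.5 mg kg⁻¹ is reached); the function and parameter names are hypothetical.

```python
def pyrvinium_dose(dose_number, start=0.1, step=0.1, maximum=0.5, doses_per_level=2):
    """Dose in mg per kg for the n-th administration (1-indexed)."""
    if dose_number < 1:
        raise ValueError("dose_number is 1-indexed")
    # Number of escalation steps completed before this dose
    level = (dose_number - 1) // doses_per_level
    # Round to suppress floating-point noise such as 0.30000000000000004
    return round(min(start + step * level, maximum), 10)


# Schedule for the first ten doses
schedule = [pyrvinium_dose(n) for n in range(1, 11)]
```

Under this reading the maximal dose is first reached at the ninth administration.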
Exome capture sequencing. DNA was extracted from cell lines using the DNeasy 
blood and tissue kit (Qiagen), and quantified using the Qubit dsDNA HS 
Assay (Life Technologies) before fragmentation to a peak size of approximately 
200 base pairs (bp) using the focal acoustic device, SonoLab S2 (Covaris). Library 
preparations were performed using the SureSelectXT Target Enrichment System for 
Illumina Paired-End Sequencing Library protocol (Agilent Technologies) with the 
SureSelectXT Mouse All Exon Kit for the capture process (Agilent Technologies). 
The quality of libraries submitted for sequencing was assessed using the High 
Sensitivity DNA assay on the 2100 bioanalyzer (Agilent Technologies). Libraries 
were quantified with qPCR, normalized and pooled to 2 nM before sequencing with 
paired end 100-bp reads using standard protocols on the HiSeq2500 (Illumina). 
The Fastq files generated by sequencing were aligned to the mm10 mouse 
reference genome using bwa”’. Copy number variation was analysed using 
ADTEx” to compare the depth of coverage in resistant and vehicle treated clones 
with the parental cell line. Variant calling was performed with VarScan2 (ref. 30), 
MuTect*' and GATK HaplotypeCaller*. The Ensembl Variant Effect Predictor 
(VEP)”* was used to predict the functional effect of the identified variants. 
Mutations detected by at least two variant callers were further analysed for 
shared mutations between cell lines and mutation spectrum. Genomic regions 
with coverage of at least eight reads in all libraries were analysed for the frequency 
of mutations. Coding exonic, untranslated regions and intronic regions were 
obtained from the UCSC Table Browser™’. Upstream regions were defined as 
1,000 bp upstream of genes, downstream regions were defined as 1,000 bp down- 
stream of genes, and intergenic regions were more than 1,000 bp from genes. 
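
The rule of retaining mutations detected by at least two variant callers can be illustrated with a few lines of Python. This is a sketch under simplifying assumptions: each call is represented as a (chromosome, position, reference allele, alternate allele) tuple that is already normalized so identical variants compare equal, and the variants shown are invented. It does not reproduce the output formats of the actual callers.

```python
from collections import Counter


def consensus_variants(calls_by_caller, min_callers=2):
    """Return variants reported by at least `min_callers` different callers."""
    counts = Counter()
    for calls in calls_by_caller.values():
        # set() ensures each caller contributes at most once per variant
        counts.update(set(calls))
    return {variant for variant, n in counts.items() if n >= min_callers}


# Hypothetical calls keyed by caller name
v1 = ("chr1", 1000, "A", "G")
v2 = ("chr2", 2000, "C", "T")
v3 = ("chr13", 3000, "G", "A")
v4 = ("chrX", 4000, "T", "C")
calls = {
    "VarScan2": [v1, v2],
    "MuTect": [v2, v3],
    "HaplotypeCaller": [v2, v3, v4],
}
shared = consensus_variants(calls)
```

Here only v2 and v3 are kept, because v1 and v4 are each reported by a single caller.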
ChIP, qPCR and sequencing analysis. Cells were cross-linked with 1% form- 
aldehyde for 15 min at room temperature and cross-linking stopped by the addi- 
tion of 0.125 M glycine. Cells were then lysed in 1% SDS, 10 mM EDTA, 50 mM 
Tris-HCl, pH 8.0, and protease inhibitors. Lysates were sonicated in a Covaris 
ultrasonicator to achieve a mean DNA fragment size of 500 bp. Immuno- 
precipitation (see antibodies) was performed for a minimum of 12 h at 4°C in 
modified RIPA buffer (1% Triton X-100, 0.1% deoxycholate, 90 mM NaCl, 10 mM 
Tris-HCl, pH 8.0 and protease inhibitors). An equal volume of protein A and G 
magnetic beads (Life Technologies) were used to bind the antibody and associated 
chromatin. Reverse crosslinking of DNA was followed by DNA purification using 
QIAquick PCR purification kits (Qiagen). Immunoprecipitated DNA was analysed 
on an Applied Biosystems StepOnePlus System with SYBR green reagents. 
The following primer pairs were used in the analysis: Myc TSS, forward 5'-GTC 
ACCTTTACCCCGACTCA-3’, reverse 5’-TCCAGGCACATCTCAGTTTG-3’; 
Myc enhancer, forward 5'-TCTTTGATGGGCTCAATGGT-3’, reverse 5'-TTC 
CCTTCACCTGATGAACC-3’. For sequencing analysis of immunoprecipitated 
DNA, DNA was quantified using the Qubit dsDNA HS Assay (Life Technologies). 
Library preparations were performed using the standard ThruPLEX-FD Prep 
Kit protocol (Rubicon Genomics) and size selected for 200-400 bp using the 
Pippin Prep (Sage Science Inc.). Fragment sizes were established using either 
the High Sensitivity DNA assay or the DNA 1000 kit and 2100 bioanalyzer 
(Agilent Technologies). Libraries were quantified with qPCR, normalized and 
pooled to 2 nM before sequencing with single-end 50-bp reads using standard 
protocols on the HiSeq2500 (Illumina). The Fastq files generated by sequencing 
were aligned to the mm10 mouse reference genome using bwa”. Peak-calling was 
performed using MACS2 (ref. 35) with default parameters and the input library as 
control. Profiles and heat maps of reads and MACS peaks in the 5 kb around the 
TSS were generated with Genomic Tools”*. 
Expression analysis by microarray and RNA-sequencing. RNA was prepared 
using the Qiagen RNeasy kit. For microarray analysis, RNA was hybridized to 
Illumina MouseWG-6 v2 Expression BeadChips. Gene expression data were pro- 
cessed using the lumi package in R. Probe sets were filtered to remove those where 
the detection P value (representing the probability that the expression is above the 
background of the negative control probe) was greater than 0.05 in at least one 
sample. Expression data were background corrected and quantile normalized. 
Normalization and inference of differential expression were performed using 
limma’’. Correction for multiple testing was performed using the method of 
Benjamini and Hochberg”*. Genes with an FDR rate below 0.05 and a fold-change 
greater than 2 were considered significantly differentially expressed. For genes 
with multiple probe sets, only the probe set with the highest average expression 
across samples was used. 
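
The multiple-testing correction and significance thresholds described above can be sketched as follows. The code implements the standard Benjamini-Hochberg step-up adjustment and then applies the stated cut-offs (FDR below 0.05, fold-change greater than 2). It does not reproduce limma's moderated statistics; treating a fold-change greater than 2 as applying in either direction is an assumption, and the function names and P values are hypothetical.

```python
import math


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_significant(fdr, log2_fold_change, fdr_cutoff=0.05, fold_cutoff=2.0):
    """FDR below the cut-off and fold-change above the cut-off in either direction."""
    return fdr < fdr_cutoff and abs(log2_fold_change) > math.log2(fold_cutoff)


# Hypothetical raw P values for four genes
fdr_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```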

For RNA sequencing analysis, RNA concentration was quantified with the 
NanoDrop spectrophotometer (Thermo Scientific). The integrity was established 
using the RNA 6000 kit and 2100 bioanalyzer (Agilent Technologies). Library 
preparations were performed using the standard TruSeq RNA Sample 
Preparation protocol (Illumina) with fragment sizes established using the DNA 
1000 kit and 2100 bioanalyzer (Agilent Technologies). Libraries were quantified 
with qPCR, normalized and pooled to 2 nM before sequencing with paired-end 
50 bp reads using standard protocols on an Illumina HiSeq2500. 

Reads were aligned to the mouse genome (Ensembl Release 75, Feb 2014) 
using Subread” and assigned to genes using featureCounts”. Differential express- 
ion was inferred using limma/voom’’. Correction for multiple testing using the 
Benjamini-Hochberg method was performed. Genes with an FDR below 0.05 and 
a fold-change greater than 2 were considered significantly differentially expressed. 

Gene set enrichments were determined using ROAST”. ROAST tests for up- or 
downregulation of genes in a given pathway (Fig. 3i) were performed on cell lines 
either stably maintained in vehicle or I-BET. P values were corrected for multiple 
testing using the method of Benjamini and Hochberg. Gene sets were obtained 
from MSigDB” and curated. Human Entrez accessions from the downloaded gene 
sets were converted into mouse accessions using orthologue information from the 
Mouse Genome Database (MGD) at the Mouse Genome Informatics website 
(http://www.informatics.jax.org; accessed June 2014). ROAST tests were per- 
formed to assess for an enrichment of a LGMP gene expression signature 
(GSE4416)” and a LGMP derived from HSC signature (GSE18483)'° in the 
I-BET resistant compared with vehicle cell lines. The gene expression program 
associated with human leukaemia stem cells was obtained from GSE30375 (ref. 17) 
and analysed with LIMMA”’. Gene expression of LSC was compared with LPC and 
genes upregulated in LSC were analysed for an enrichment of the Wnt/β-catenin 
pathway using ROAST. 

GSEA terms. The following GSEA terms were used. WNT/β-catenin: 
ST_WNT_BETA_CATENIN_PATHWAY; JAK/STAT: KEGG_JAK_STAT_SIGNALING_PATHWAY; 
PI(3)K/AKT/mTOR: REACTOME_PI3K_AKT_ACTIVATION; 
NF-κB: REACTOME_ACTIVATION_OF_NF_KAPPAB_IN_B_CELLS; 
RAS/ERK/MAPK: KEGG_MAPK_SIGNALING_PATHWAY; 
NOTCH: KEGG_NOTCH_SIGNALING_PATHWAY; hippo: REACTOME_SIGNALING_BY_HIPPO; 
hedgehog: KEGG_HEDGEHOG_SIGNALING_PATHWAY; 
TGF-β: KEGG_TGF_BETA_SIGNALING_PATHWAY. 

Antibodies. The following antibodies were used in ChIP and immunoblotting 
assays: anti-Brd2 (A302-583A, Bethyl Labs), anti-Brd3 (A302-368A, Bethyl Labs), 
anti-Brd4 (A301-985A, Bethyl Labs and ab128874, abcam), anti-H3K27ac 
(ab4729, abcam), anti-β-catenin (610154, BD Biosciences), anti-c-Myc (9402S, 
Cell Signalling Technology), anti-β-actin (A1978, Sigma-Aldrich) and 
anti-Hsp60 (sc-13966, Santa Cruz Biotechnology). 

Correlation of WNT/β-catenin pathway expression and response to I-BET151. 
A principal component analysis was performed on the qRT-PCR data of β-catenin 
pathway and target genes from primary human AML samples. Pearson’s correlation 
was calculated between the expression of the pathway genes in the first principal 
component, and the responsiveness to I-BET151. 
Correlation of log gene expression of selected WNT/β-catenin pathway genes 
was assessed using a corrgram and correlation between log expression and apoptosis 
was examined using scatterplots. As expression between genes was typically 
highly correlated, or inversely correlated, the log-expression data were summarized 
using the first principal component and compared to the level of apoptosis. 
A multiple linear regression model was also fitted to the data. As the full model 
was close to saturated (8 samples, 6 genes), a stepwise model selection procedure 
based on the Akaike Information Criterion (AIC), as implemented in 
the R function step, was used. The model that minimized the AIC excluded 
one gene (AXIN2). 
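
The first step of this analysis, summarizing correlated log-expression values by the first principal component and correlating it with response, can be sketched in plain Python. This is an illustrative reimplementation, not the authors' R code: the first component is found by power iteration on the sample covariance matrix, the sign of a principal component is arbitrary, and the starting vector of ones is assumed not to be orthogonal to the leading eigenvector. Function names and data are hypothetical.

```python
import math


def first_pc_scores(rows, iterations=500):
    """Scores of each sample (row) on the first principal component."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centred = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix of the genes (columns)
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    v = [1.0] * p
    for _ in range(iterations):  # power iteration towards the leading eigenvector
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return [sum(c[j] * v[j] for j in range(p)) for c in centred]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log-expression of two genes in four samples, and a response
log_expression = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
apoptosis = [10.0, 20.0, 30.0, 40.0]
r = pearson(first_pc_scores(log_expression), apoptosis)
```

Stepwise model selection by AIC is not reproduced here.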

Patient material. Peripheral blood or bone marrow containing >80% blasts was 
obtained from patients following consent and under full ethical approval at each 
involved institute. 
Extended Data Figure 1 | Establishment of a model of BET inhibitor 
resistance. a, Resistance to I-BET demonstrated in cell proliferation assays. 
Representative dose-response curve of a vehicle treated clone and a resistant 
clone stably maintained in 1,000 nM I-BET after 72 h of growth (mean ± s.d., 
n = 4 per group). b, Cross-resistance to chemically distinct BET inhibitors 
demonstrated in cell proliferation assays. Representative dose-response curve 
of a vehicle-treated clone and resistant clone after 72 h of growth (mean ± s.d., 
n = 4 per group). c, Abrogation of response to I-BET-mediated cell cycle 
arrest. This is more evident in resistant clones stably maintained in higher 
concentrations of I-BET. Data from biological duplicate experiments 
(mean ± s.e.m.). d, Representative flow plots of cell lines stably transduced with 
an inducible shRNA vector. Vector-positive cells constitutively express Venus. 
After the introduction of doxycycline, shRNA-expressing cells co-express 
Venus and dsRED. Selective disadvantage consequent to shRNA expression 
results in drop out of dsRED-positive cells from culture over time, which is 
assessed by flow cytometry. e, Independent inducible shRNAs specifically 
reduce the expression of Brd4, but not Brd2 or Brd3, after 48-72 h of 
doxycycline. Messenger RNA levels in shRNA-positive cells normalized to 
mRNA expression in shRNA-negative cells in biological duplicate experiments 
(mean ± s.e.m.). f, Brd4 protein levels are reduced in shRNA-positive cells. 
Uncropped blots are shown in Supplementary Fig. 1. g, In addition to resistance 
to selective knockdown of Brd4, BET-inhibitor-resistant cells are also refractory 
to RNAi-mediated dual knockdown of Brd3 and Brd4. shRNA-mediated 
knockdown of Brd2 has minimal effect on both vehicle-treated and resistant 
clones. dsRED-positive cells normalized to day 1 after doxycycline exposure in 
biological duplicate experiments (mean ± s.e.m.). h, Reduction of Brd3/4 
mRNA expression and Brd2 mRNA expression with two independent shRNAs 
after 48-72 h of doxycycline. mRNA levels in shRNA-positive cells normalized 
to mRNA expression in shRNA-negative cells in biological duplicate 
experiments (mean ± s.e.m.). i, Examination of vehicle-treated and resistant 
clones demonstrates no major morphological differences. j, Resistant clones 
are smaller and demonstrate homogeneity in size and complexity (FSCmid/ 
SSClow) by flow cytometry. k, Resistant clones are enriched for L-GMPs 
(Lin−, Sca−, cKit+, CD34+, FcγRII/RIII+). Representative FACS analysis 
of vehicle-treated and resistant clones; percentages represent proportion of 
parent gate. 
Extended Data Figure 2 | Resistance to BET inhibitors also arises from an 
immature cell compartment in MLL-ENL leukaemia. a, Strategy for the 
generation of resistant cell lines from primary HSPC after retroviral 
transduction with the oncogene MLL-ENL. The parental cell line was serially 
re-plated in cytokine supplemented semi-solid media containing either vehicle 
(0.1% DMSO) or 400 nM I-BET (~IC,p of parental cell line). Cells in each plate 
were then washed and transferred to liquid culture to generate cell lines. 
Resistant cell lines were subsequently exposed to increasing selection pressure 
in liquid culture. Vehicle-treated cell lines and the parental cell line were 
identically passaged. b, Resistant cell lines bearing MLL-ENL are smaller and 
demonstrate homogeneity in size and complexity (FSCmid/SSClow) in addition 
to exhibiting an immature immunophenotype (Gr1−/CD11b−). 
Representative FACS analysis of vehicle-treated and resistant cell lines. 

c, Resistant cell lines bearing MLL-ENL demonstrate increased expression of 
WNT/β-catenin pathway genes. RT-PCR data performed in biological 
triplicate (mean ± s.d.). 
Extended Data Figure 3 | Resistance to BET inhibitors emerges from 
leukaemia stem cells. a, Transplant cohorts and survival of mice injected 
with vehicle-treated and resistant clones in limit dilution analyses of primary 
syngeneic transplants displayed in Fig. 2d, e. b, Experimental strategy for 
derivation of in vivo resistance to BET inhibitors in a MLL-AF9 leukaemia 
model. After primary transplant of a vehicle-treated clone, serial transplant of 
either I-BET-exposed or I-BET-naive leukaemias, derived from whole bone 
marrow of diseased mice, was undertaken until loss of I-BET-mediated survival 
advantage was observed. Treatment was started on days 11-13. c, Progressive 
loss of I-BET-mediated survival advantage observed in serial transplant 
generations. Kaplan-Meier curves of serial transplant generations. Second 
transplant: I-BET naive n = 6, I-BET exposed n = 6. Third transplant: I-BET 
naive n = 2, I-BET exposed n = 3. Fourth transplant: I-BET naive n = 2, I-BET 
exposed n = 3. Fifth transplant: I-BET naive n = 4, I-BET exposed n = 5. 
d, Limit dilution analyses of leukaemias derived from bone marrow of diseased 
mice chronically exposed to I-BET after the fourth transplant demonstrate 
that fewer than 10 cells are reliably able to transfer leukaemia. Kaplan-Meier 
curves of C57BL/6 mice injected with indicated number of cells. e, Chronic 
I-BET exposure significantly enriches for leukaemia stem cells in vivo. 
f, Transplant cohorts and survival of limit dilution analyses of data displayed in 
d and e. g, Gating strategy for identification of L-GMPs in whole mouse bone 
marrow. 
Extended Data Figure 4 | Enrichment of a LMPP population in AML PDX. 
a, Experimental strategy for treatment of NOD/SCID/Il2rg−/− 
(NSG) mice bearing AML PDXs. Treated mice (with either vehicle or I-BET) 
belonged to identical transplant generations. b, Cytogenetic and genetic 
information of PDX models used. c, Gating strategy for identification of 
LMPP-like LSCs and GMP-like LSCs from mouse bone marrow. mTer119/ 
mCD45 denotes mouse Ter119/CD45; hCD45/hCD3/hCD19/hCD33/hCD34/ 
hCD38/hCD123/hCD45RA/hCD90 denotes human CD45/CD3/CD19/CD33/ 
CD34/CD38/CD123/CD45RA/CD90. d, Representative FACS analysis of bone 
marrow obtained from vehicle- and I-BET-treated mice demonstrating 
enrichment of LMPP-like LSCs in I-BET-treated mice. Events displayed are 
gated on mTer119−/hCD45+/hCD33+ cells and are expressed as a percentage 
of total hCD45+ cells. 
Extended Data Figure 5 | Intrinsic resistance to BET inhibition is not a 
feature of L-GMPs. a, Experimental strategy for testing intrinsic resistance of 
L-GMPs to BET inhibition. After syngeneic transplant of a vehicle-treated 
clone, L-GMPs were FACS-isolated from whole mouse bone marrow of 
diseased mice and cultured in cytokine supplemented semi-solid media 
containing either vehicle (0.1% DMSO) or 1 μM I-BET. b, L-GMPs do not 
demonstrate intrinsic resistance to I-BET. Colony counts after 7 days of growth 
in biological triplicate (see also Fig. 2h) experiments (mean ± s.e.m.) of 
FACS-isolated L-GMPs after primary transplant of vehicle-treated clones in two 
additional independent mice. M2, mouse 2; M3, mouse 3. 
Extended Data Figure 6 | Further genetic characterization of BET- 
inhibitor-resistant cells. a, Comparison of whole-exome sequencing (WES) 
data from early and late time points identifies non-advantageous passenger 
mutations. Data from WES of samples obtained at an earlier time point to that 
presented in Fig. 3d is shown. Call out box identifies genes within a small 
region on chromosome 13 in one resistant clone which demonstrate copy 
number gain and are associated with increased mRNA expression relative to 
non-resistant cells. b, Venn diagram demonstrating gene mutations shared 
between vehicle-treated and resistant clones. Highlighted in the call out box 
are 24 gene mutations shared between resistant clones but not found in 
vehicle-treated clones. c, Resistant clones do not exhibit marked genetic 
instability with low mutation frequency observed. d, No specific mutation 
signature is identified in resistant clones. e, Correlation of genes identified in 
copy number gain region on chromosome 13 with gene expression data from 
the two resistant clones examined by WES. Fold change in gene expression 
compared to vehicle-treated clones obtained from microarray analysis is 
shown. f, Mutations detected by WES can be validated with data obtained from 
RNA sequencing (RNA-seq) of the same clones. Selected examples of 
mutations unique to resistant clones and shared between vehicle-treated and 
resistant clones are shown in integrative genomics viewer (IGV) tracks. 
Extended Data Figure 7 | Further epigenetic and gene expression 
characterization of BET-inhibitor-resistant cells. a, BRD4 binding profile at 
Polr2a enhancer elements demonstrates no significant loss of BRD4 binding or 
H3K27ac levels in resistant clones. b, Genome wide profiling of BRD2 and 
BRD3 binding at TSSs comparing vehicle-treated and resistant clones is 
demonstrated in heat maps centred on the TSS of annotated genes with 5 kb 
flanking sequence either side. Red indicates higher density of reads in ChIP-seq 
data. c, Heat map of differential mRNA expression data from a vehicle- 
treated and resistant clone performed by RNA-seq in biological triplicate 
experiments. d, RNA-seq and microarray data are highly correlated. 
Correlation of log2 fold change (logFC) between RNA-seq and microarray data 
across all genes. No genes show opposing expression changes. Dotted line 
indicates y = x, blue dots represent genes that are significantly differentially 
expressed (gene expression log(FC) at least ±1.0, FDR corrected P < 0.05). 
e-g, GSEA shows enrichment of LSC signature in I-BET-resistant cells, with 
resistant clones stably maintained in progressively higher concentrations 
of I-BET demonstrating increased enrichment of differentially expressed genes 
associated with a L-GMP self-renewal program. Barcode plot compares 
differential expression of genes in vehicle-treated and resistant clones to 
published microarray data comparing L-GMPs and MLL-AF9 cells propagated 
in liquid culture. Shaded area in the centre of plot shows genes ranked by fold 
change in expression in resistant relative to vehicle clones. Pink and blue 
shading represent significantly up- and downregulated genes, respectively. 
Upregulated and downregulated genes in the previously published LSC gene 
expression signature are shown in red and blue, respectively. Resistant (400) 
upregulated FDR = 1.2 X 10 ', downregulated FDR = 9.3 X 10° *. Resistant 
(600) upregulated FDR <1.0 X 10 *, downregulated FDR <2.5 X 107*. 
Resistant (800) upregulated FDR <5.0 X 10°, downregulated FDR 

<5.0 X 10 >. h, GSEA demonstrates that resistant clones show significant 
enrichment for genes associated with a self-renewal program identified from 
L-GMPs arising from haematopoietic stem cells (L-GMP HSCs). Upregulated 
FDR = 4.69 X 10”, downregulated FDR = 1.3 X 10° *. i, RNA-seq 

identifies enrichment of LSC gene expression signature following chronic 
in vivo BET inhibitor exposure. Heat map of differential mRNA expression 
data from RNA-seq of leukaemias from the bone marrow of I-BET-exposed 
(n = 2) and I-BET-naive (n = 2) mice after the fourth transplant. j, GSEA 
of RNA-seq data identifies enrichment of a previously published LSC gene 
expression signature in leukaemias chronically exposed to I-BET in vivo. 
Barcode plot compares differential expression of genes in I-BET-exposed and 
I-BET-naive leukaemias to published data comparing L-GMPs and MLL-AF9 
cells propagated in liquid culture. Shaded area in the centre of plot shows 
genes ranked by fold change in expression in I-BET-exposed relative to I-BET- 
naive leukaemias. Pink and blue shading represent significantly up- and 
downregulated genes, respectively. Upregulated and downregulated genes in 
the previously published LSC gene expression signature are shown in red and 
blue, respectively, and correlate with expression of genes in the I-BET- 
exposed leukaemias (FDR = 0.05). 
Extended Data Figure 8 | Negative regulation of Wnt/β-catenin signalling 
in resistant clones re-establishes sensitivity to BET inhibition. a, Schematic 
representation of the Wnt/β-catenin pathway. Highlighted by green stars 
are components of the pathway identified from transcriptome data which are 
significantly upregulated (>1.5-fold change, FDR <0.05) in resistant clones 
relative to vehicle-treated clones. b, GSEA of previously published human LSC 
gene expression data demonstrates enrichment of the WNT/β-catenin 
pathway. c, Dkk1 expression in the resistant cells before and after retroviral 
transduction of mouse Dkk1. qRT-PCR data from biological triplicate 
experiments (mean ± s.d.). d, Partial restoration of sensitivity to BET 
inhibition is observed in resistant clones after transduction with Dkk1. 
Dose-response curve of a vehicle-treated clone and resistant clone with and without 
expression of Dkk1 after 72 h of growth (mean ± s.e.m., n = 16 per group). 
e, Restoration of BET inhibitor induced cell-cycle arrest in resistant clones 
stably transduced with Dkk1. Flow cytometric analysis after 48 h exposure to 
either vehicle or 1,000 nM I-BET in biological triplicate experiments 
(mean ± s.e.m.). f, Resistant clones stably expressing Dkk1 do not show 
immunophenotypic enrichment for L-GMPs (see Extended Data Fig. 1k for 
comparison). Representative FACS analysis of resistant clone expressing Dkk1; 
percentages represent proportion of parent gate. g, Abrogation of Myc mRNA 
and protein expression in vehicle-treated clones after treatment with I-BET. 
qRT-PCR data of Myc expression in a vehicle-treated clone after 6 h of 
treatment with 1 μM I-BET151 in biological triplicate experiments 
(mean ± s.d.). Uncropped blots are found in Supplementary Fig. 1. h, Negative 
regulation of Wnt/β-catenin signalling by Dkk1 in resistant clones results in 
decreased expression of Myc. qRT-PCR data from biological triplicate 
experiments (mean ± s.d.). i, Small molecule inhibition of Wnt/β-catenin 
pathway expression re-establishes sensitivity to BET inhibition. Exposure of 
resistant clones to the Wnt/β-catenin pathway inhibitor pyrvinium also results 
in re-expression of Gr1+ and CD11b+. Representative FACS analysis of 
resistant clone in the presence or absence of pyrvinium. j, Pyrvinium synergises 
with I-BET to induce a modest cell cycle arrest and an induction of cell 
death (sub-G0 cell fraction). Data from biological triplicate experiments 
(mean ± s.e.m.). k, l, Pyrvinium reduces the expression of Wnt/β-catenin 
target genes such as Myc and Ccnd2 in vehicle-treated and resistant cells. 
qRT-PCR data from biological duplicate experiments (mean ± s.e.m.). m, These 
findings are similar to those seen for resistant cells stably expressing Dkk1. 
drug exposure to either vehicle (0.1% DMSO) or 1 μM I-BET in a vehicle 
clone transduced with an Apc shRNA. b, c, Independent shRNAs directed 
against Apc confer resistance to vehicle-treated clones. Viable, shRNA-positive 
cells after treatment with either vehicle or I-BET normalized to day 0 performed 

performed in biological triplicate (mean ± s.e.m.). e, shRNA-mediated 
knockdown of Apc results in increased expression of Wnt/β-catenin target gene 
Myc. qRT-PCR data from FACS isolated shRNA containing cells performed 
in biological duplicate (mean ± s.e.m.). 


©2015 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 10 image, panels a-e. Panel a: samples #1-#8 (MLL-AF6, MLL-AF9, MLL-AF9, MLL-AF9, MLL-AF6, MLL-ENL, MLL-ENL, MLL-AF9); gene key: 1, AXIN2; 2, CCND1; 3, CTNNB1; 4, FZD5; 5, MYC; 6, TCF4 (E2-2); values in square brackets give % apoptosis relative to vehicle-treated sample. Panel d: table of MLL fusion, % viable cells and % apoptosis (Annexin V+/PI+) for DMSO and I-BET. Panel e: viability of samples #1-#8 with DMSO and I-BET.]

Extended Data Figure 10 | WNT/β-catenin pathway expression correlates
with responsiveness to I-BET in primary human AML samples. a, Assessment
of β-catenin pathway gene expression in eight primary human AML
samples with associated response to I-BET exposure. Each panel represents an
individual primary human AML sample, with genetic abnormality denoted.
Waterfall plot of relative qRT-PCR expression data of key β-catenin pathway
genes (AXIN2, CCND1, CTNNB1, FZD5, MYC, TCF4 (also known as E2-2)) is
displayed. Each bar is labelled 1-6 according to gene represented. Relative
apoptosis observed after 48 h exposure to 500 nM I-BET versus vehicle (0.1%
DMSO) is denoted in square parenthesis and is also represented as a heat
map background shading in each panel. b, log2-transformed expression levels
of selected genes in the WNT/β-catenin pathway were measured using
qRT-PCR. A corrgram shows the genes are highly correlated with each other.
The colour and thinness of the ellipse indicate the strength of correlation
(a line is perfect correlation; a circle is uncorrelated). The ellipse direction
indicates the sign of the correlation (correlated: right/blue, inversely correlated:
left/red). c, Expression of selected genes is correlated with apoptosis.
Scatterplots show apoptosis versus the log2 expression level of each gene.
Expression of five genes (CCND1, CTNNB1, FZD5, MYC and TCF4) predicts
apoptosis. The relationship is highlighted in a plot of apoptosis predicted
using a multiple linear regression model with the five genes versus the actual
data. d, Apoptosis observed after 48 h exposure to either vehicle (0.1% DMSO)
or 500 nM I-BET across eight primary human AML samples. e, Relative
viability of primary human AML samples after treatment with I-BET.

Transcriptional plasticity promotes primary and
acquired resistance to BET inhibition


Philipp Rathert*, Mareike Roth*, Tobias Neumann, Felix Muerdter, Jae-Seok Roe, Matthias Muhar, Sumit Deswal,
Sabine Cerny-Reiterer, Barbara Peter, Julian Jude, Thomas Hoffmann, Lukasz M. Boryn, Elin Axelsson,
Norbert Schweifer, Ulrike Tontsch-Grunt, Lukas E. Dow, Davide Gianni, Mark Pearson, Peter Valent, Alexander Stark,
Norbert Kraut, Christopher R. Vakoc & Johannes Zuber


Following the discovery of BRD4 as a non-oncogene addiction target
in acute myeloid leukaemia (AML), bromodomain and extra terminal
protein (BET) inhibitors are being explored as a promising
therapeutic avenue in numerous cancers. While clinical trials have
reported single-agent activity in advanced haematological malignancies,
mechanisms determining the response to BET inhibition
remain poorly understood. To identify factors involved in primary
and acquired BET resistance in leukaemia, here we perform a
chromatin-focused RNAi screen in a sensitive MLL-AF9;NrasG12D-driven
AML mouse model, and investigate dynamic transcriptional
profiles in sensitive and resistant mouse and human leukaemias. Our 
screen shows that suppression of the PRC2 complex, contrary to 
effects in other contexts, promotes BET inhibitor resistance in 
AML. PRC2 suppression does not directly affect the regulation of 
Brd4-dependent transcripts, but facilitates the remodelling of regu- 
latory pathways that restore the transcription of key targets such as 
Myc. Similarly, while BET inhibition triggers acute MYC repression 
in human leukaemias regardless of their sensitivity, resistant leukae- 
mias are uniformly characterized by their ability to rapidly restore 
MYC transcription. This process involves the activation and recruit- 
ment of WNT signalling components, which compensate for the loss 
of BRD4 and drive resistance in various cancer models. Dynamic 
chromatin immunoprecipitation sequencing and self-transcribing 
active regulatory region sequencing of enhancer profiles reveal that 
BET-resistant states are characterized by remodelled regulatory land- 
scapes, involving the activation of a focal MYC enhancer that recruits 
WNT machinery in response to BET inhibition. Together, our results 
identify and validate WNT signalling as a driver and candidate bio- 
marker of primary and acquired BET resistance in leukaemia, and 
implicate the rewiring of transcriptional programs as an important 
mechanism promoting resistance to BET inhibitors and, potentially, 
other chromatin-targeted therapies. 

BRD4 is a chromatin reader that regulates transcription through
linking histone acetylation and core components of the transcriptional
apparatus. Recent studies suggest that BRD4 regulates distinct gene
sets through interacting with context-specific enhancers and transcription
factors. However, the mechanisms underlying the wide range of
sensitivity to BET inhibition remain elusive, and so far no predictive
biomarker has been identified. Towards understanding these mechanisms,
we sought to functionally identify chromatin factors that are
required for rendering AML cells sensitive to JQ1, a well-known BET
inhibitor. To this end, we constructed a microRNA-embedded short
hairpin RNA (shRNAmir) library covering 626 chromatin regulators
and screened it in the same MLL-AF9;NrasG12D-driven model that led
to the identification of BRD4 as a candidate target (Fig. 1a).
Deep-sequencing following transduction (T0) and 7 days of selection (T1)
identified chromatin-associated dependencies, including Smarca4 and
Brd4 as top hits (Extended Data Fig. 1a–c). To control for unspecific
events, we mixed the GFP+ library population with mCherry+ control
cells, and subsequently treated with DMSO or JQ1. While mCherry+
cells disappeared over time (Fig. 1b), GFP+ cells survived and eventually
grew in the presence of 50 nM JQ1 (corresponding to an IC70
dose; Extended Data Fig. 1d), indicating that resistance emerged from
shRNA-mediated effects. Four shRNAs showed an outstanding enrichment
that was consistent between replicates, despite almost 5 weeks of
independent culture (Fig. 1c and Extended Data Fig. 1a, e). These
included two shRNAs targeting Suz12, one targeting Psip1, and one
previously characterized potent Dnmt3a shRNA. All four shRNAs
strongly suppressed their target mRNA (Extended Data Fig. 1f), and
were validated to promote JQ1 resistance in single assays (Fig. 1d and
Extended Data Fig. 2a).
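
The enrichment readout of such a pooled screen can be sketched in a few lines. The example below is a minimal illustration, not the authors' pipeline: it assumes that reads are scaled to reads per million, that a pseudocount of 1 is added, and that log2(T2/T1) ratios are averaged across replicates. The function names and the pseudocount are hypothetical choices.

```python
import math


def log2_read_ratios(t1_counts, t2_counts, pseudocount=1.0):
    """Return {shRNA: log2(T2/T1)} after scaling each sample to reads per million.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for shRNAs that drop out completely.
    """
    t1_total = sum(t1_counts.values())
    t2_total = sum(t2_counts.values())
    ratios = {}
    for shrna, t1 in t1_counts.items():
        t2 = t2_counts.get(shrna, 0)
        t1_rpm = (t1 + pseudocount) / t1_total * 1e6
        t2_rpm = (t2 + pseudocount) / t2_total * 1e6
        ratios[shrna] = math.log2(t2_rpm / t1_rpm)
    return ratios


def mean_ratio_across_replicates(replicates):
    """Average the log2 ratios over replicates; each replicate is a (t1, t2) pair."""
    per_rep = [log2_read_ratios(t1, t2) for t1, t2 in replicates]
    return {s: sum(r[s] for r in per_rep) / len(per_rep) for s in per_rep[0]}
```

An shRNA that is enriched under JQ1 but not under DMSO would show a high average ratio in the JQ1 arm only, which is the pattern described for the four top hits.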

The finding that suppression of Suz12, a component of the PRC2
complex, promotes resistance to JQ1 was surprising in two ways. First,
a recent report, based on studies in nerve sheath tumours, has implicated
Suz12 deficiency as a condition that increases sensitivity to BET
inhibition. Second, several studies (including work from our group)
have characterized PRC2 as a requirement in MLL-AF9-driven
AML. Notably, the most potent Suz12 shRNA in previous studies
(Suz12.1676) did not score in the pooled screen and was validated to
strongly inhibit proliferation in our model (Fig. 1e). However, when
we added JQ1, Suz12.1676-expressing cells were rapidly enriched, indicating
that Suz12 deficiency turns from a detrimental into a favourable
condition. Similar effects were observed using potent shRNAs targeting
Ezh2 and Eed, two other PRC2 components (Fig. 1e and Extended Data
Fig. 2b). We also validated this phenomenon using Tet-regulatable
RNA interference, where we included a validated Myc shRNA to rule
out that resistance is merely a consequence of reduced proliferation
(Extended Data Fig. 2c). Resistant cells generated through Suz12 suppression
showed a global loss of H3K27me3 (Fig. 1f), and were also
refractory to the effects of JQ1 in methyl-cellulose assays (Extended
Data Fig. 2d) and in vivo (Fig. 1g). While recipient mice of Suz12-deficient
cells, consistent with previous observations, showed a
delayed disease progression, JQ1 had no anti-leukaemic effects in this
context (Fig. 1g). Together, these data demonstrate that loss of Suz12
impairs rather than promotes BET sensitivity in AML, revealing
another scenario where PRC2 plays opposing roles in different cancers.

To search for underlying mechanisms, we investigated whether our
hits affected the regulation of BRD4-dependent transcripts, and found
that the acute response to JQ1 is largely unchanged in the absence of
Suz12, Dnmt3a or Psip1 (Fig. 2a). However, under long-term treatment,
60-80% of JQ1-induced changes reverted and transcription of
the key target Myc was restored (Fig. 2a–c and Extended Data Fig. 2e).
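
As a rough illustration of how a "fraction of reverted changes" could be computed from fold-change data, the sketch below uses the thresholds given in the Figure 2 legend (FC >2/<0.5 after 2 h, FC >2/<0.33 after 24 h). Two points are assumptions of this sketch rather than statements from the paper: a gene counts as a response gene if it passes either threshold, and a gene counts as reverted if its long-term fold change lies between 0.5 and 2. The function names are hypothetical.

```python
def is_response_gene(fc_2h, fc_24h):
    """Apply the legend's thresholds; passing either time point qualifies (assumed)."""
    changed_2h = fc_2h > 2 or fc_2h < 0.5
    changed_24h = fc_24h > 2 or fc_24h < 0.33
    return changed_2h or changed_24h


def fraction_reverted(genes, low=0.5, high=2.0):
    """genes maps a gene name to (fc_2h, fc_24h, fc_long_term) versus control.

    A response gene is called reverted when its long-term fold change falls
    back inside [low, high]; this window is an illustrative assumption.
    """
    response = {g: v for g, v in genes.items() if is_response_gene(v[0], v[1])}
    if not response:
        return 0.0
    reverted = [g for g, v in response.items() if low <= v[2] <= high]
    return len(reverted) / len(response)
```

With real RPKM fold changes, the same calculation run separately for each resistant population would give one percentage per shRNA, comparable in spirit to the 60-80% range quoted above.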


1Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), 1030 Vienna, Austria.
2Cold Spring Harbor Laboratory, Cold Spring Harbor, New York 11724, USA.
3Department of Internal Medicine I, Division of Hematology and Hemostaseology, Medical University of Vienna, 1090 Vienna, Austria.
4Ludwig Boltzmann Cluster Oncology, Medical University of Vienna, 1090 Vienna, Austria.
5Boehringer Ingelheim - Regional Center Vienna GmbH, 1121 Vienna, Austria.
6Department of Medicine, Hematology & Medical Oncology, Weill Cornell Medical College, New York 10065, USA.


*These authors contributed equally to this work. 


24 SEPTEMBER 2015 | VOL 525 | NATURE | 543 




[Figure 1, image not reproduced. Panels a-g: a, screening schematic (chromatin library of 626 genes and
2,917 shRNAs, infection of murine AML cells in triplicate, G418 selection for 7 d, mixing with empty
mir30 control cells, DMSO or JQ1); b, time course over days of JQ1 treatment; c, scatter plot of read
ratio T2/T1 (log2) with Suz12.909, Suz12.1842, Psip1.2474 and a Dnmt3a shRNA highlighted;
d, e, competitive proliferation assays over days (Ctrl shRNA, Suz12.1842, Suz12.1676, Ezh2.781; DMSO
versus 50 nM JQ1); f, immunoblots of Suz12 and H3K27me3 in Ren.713, Suz12.1676 and Suz12.1842
cells; g, imaging and survival curves over days for Ren.713 and Suz12.1676 under DMSO or JQ1
(P = 0.0042 and P = 0.0437).]


Figure 1 | Multiplexed shRNAmir screening identifies chromatin factors
that prevent resistance to BET inhibition. a, Schematic of the multiplexed
screening strategy. Mouse MLL-AF9;NrasG12D AML cells were infected with a
library targeting 626 chromatin-associated genes (GFP+) or empty control
(mCherry+). G418-selected cell populations were mixed and treated with
DMSO or 100 nM JQ1 for 4 days followed by 50 nM JQ1 for 22 days. Genomic
DNA isolated from T0, T1 and T2 was used to amplify and deep-sequence
shRNA guides. b, Relative abundance of mCherry+ control cells and absolute
number of GFP+/shRNA-expressing cells in the screen population over time.
c, Scatter plot showing the average ratio of normalized reads before (T1) and
after 26 days of DMSO or JQ1 treatment (T2) for all 2,917 shRNAs.
d, e, Competitive proliferation assays of MLL-AF9;NrasG12D leukaemia cells
expressing the indicated shRNAs. Shown is the fraction of GFP+/shRNA+
cells relative to the initial measurement. After 10 days, each sample
was split in half, treated with DMSO or 50 nM JQ1 and analysed over 8 days.
f, Immunoblotting of Suz12 and H3K27me3 in resistant AML cells expressing
the indicated Suz12 shRNAs (LT, long-term culture in 50 nM JQ1 for 6 weeks)
and Ren.713 control cells (with or without treatment with 200 nM JQ1 for
24 h). g, Top, bioluminescent imaging of mice transplanted with 1 × 10^6
MLL-AF9;NrasG12D leukaemia cells expressing the indicated shRNAs. Treatment
with JQ1 (50 mg kg^-1 per day) or DMSO carrier was initiated at day 3 after
transplantation. Bottom, Kaplan-Meier survival curves of control and JQ1-treated
mice (n = 5). Statistical significance was calculated using a log-rank test.


To investigate this rebound phenomenon, we focused subsequent analyses
on PRC2, a writer of repressive H3K27me3 marks. Based on
recent reports showing that BRD4 interacts with WHSC1L1, a writer
of H3K36 methylation marks that can recruit PRC2, we wondered
whether PRC2 is involved in repressing BRD4 targets following BET
inhibition. However, chromatin immunoprecipitation sequencing
(ChIP-seq) analysis before and after JQ1 treatment showed no increase
in H3K27me3 at Brd4 occupancy sites (Extended Data Fig. 2f, g), which
together with RNA-seq studies suggests that PRC2 does not regulate
these genes directly.

Next, we investigated whether the rebound of BRD4 targets is assoc- 
iated with changes in the enhancer landscape. Under long-term JQ1 
treatment, resistant cells showed a global loss of enhancer-associated 
H3K27ac marks, which was reversible upon drug withdrawal 
(Extended Data Fig. 3a). At the level of Myc, the loss of H3K27ac 

Figure 2 | BET-resistant AML cells restore the transcription of key Brd4
target genes through remodelling of regulatory landscapes and pathways.
a, Heat map showing RPKM fold change (FC) of 235 response genes in
MLL-AF9;NrasG12D leukaemia cells expressing the indicated shRNAs after
treatment with DMSO or JQ1 (200 nM) for 2 h or 24 h, and after long-term
(LT) JQ1 treatment (50 nM) for 6 weeks. Response genes were defined
based on JQ1-induced changes in Ren.713 expressing leukaemia cells
(FC >2/<0.5 after 2 h and FC >2/<0.33 after 24 h). Samples are presented
according to non-hierarchical clustering; genes are grouped in up- or
downregulated targets in Ren.713 control cells, and sorted by their average
re-expression in resistant leukaemia. b, Response genes in a were
grouped into four categories based on the divergence of their expression in
resistant AML compared to AML expressing Ren.713 (see legend of
Extended Data Fig. 2e for details). c, Time course of Tifab and Myc
transcript levels in MLL-AF9;NrasG12D leukaemia expressing the indicated
shRNAs, relative to untreated cells expressing Ren.713. d, ChIP-seq
occupancy profiles of Brd4 and H3K27ac in Myc regulatory regions in
sensitive AML cells (wild type and Ren.713) and Suz12.1842-expressing
resistant AML cells under long-term treatment with JQ1 (50 nM). The
H3K27ac profile in Ren.713 control AML cells is shown on top for the
entire region (~2 Mb), together with validated transcript models from the
mm10 genome assembly. Additional tracks are zoomed in on the proximal
region and a distal region containing an established cluster of enhancers
(E1-E5), as indicated (dotted lines). The y axis reflects the number of
normalized cumulative tag counts in each region. e, Schematic of RNA-seq
profiling in resistant AML cells generated through expression of three
independent shRNAs targeting Suz12 or Ezh2, which are treated as
biological triplicates and compared to triplicate control cells. f, Gene set
enrichment analysis of expression changes in 22 KEGG signalling pathways
in resistant AML cells generated as described in e. Plotted are normalized
enrichment scores (NES) against nominal P values; significantly altered
gene sets (P < 0.05) are indicated in the graph.

was most prominent at a 3′ cluster of lineage-specific enhancers
known to be required for Myc transcription (Fig. 2d). In parallel,
we observed a strong focal gain of H3K27ac in the first intron of
Pvt1 (a long non-coding RNA 3′ of Myc), which was the tenth most
prominent of only 119 genomic regions that gained H3K27ac in resistant
cells (Fig. 2d and Extended Data Fig. 3b, c). Together, these
findings reveal that BET resistance is associated with profound
changes in the enhancer landscape.

To probe regulatory pathways involved in the re-expression of Brd4
targets, we analysed the transcriptomes of sensitive and resistant cells
generated using three independent PRC2 shRNAs (Fig. 2e). Gene set
enrichment analysis identified alterations in five major signalling
pathways (Fig. 2f). Almost half of the genes associated with upregulated
pathways contain H3K27me3 marks in sensitive cells (Extended
Data Fig. 3d-f), indicating that loss of PRC2 can facilitate their
transcriptional activation. The most significant alteration was an upregulation
of genes associated with Wnt signalling, which can drive Myc
transcription and has important functions in normal and leukaemic
stem cells (LSC). Of note, resistant cells in our model were not
enriched for LSC-associated surface markers and expression signatures
(Extended Data Fig. 4a–c), which does not exclude a role of
LSC in resistance. Collectively, our findings suggest that loss of
PRC2 promotes resistance through facilitating the derepression of
compensatory pathways that restore the transcription of critical
BRD4 target genes.

Given this complexity, we sought to explore whether any phenom- 
ena in our model of acquired resistance are also associated with prim- 
ary BET resistance. To this end, we determined the JQ1 sensitivity in 
246 human cell lines (Extended Data Fig. 5a), and selected three sens- 
itive and three resistant lines from three cancer subtypes for dynamic 
transcriptional profiling. While RNA-seq profiles distinguished cancer 
subtypes as expected, sensitive and resistant contexts could not be 
differentiated, neither through their steady-state transcriptomes, nor 
through comparing JQ1-induced changes (Extended Data Fig. 5b, c). 
In fact, BET inhibition triggered highly distinct responses regardless of 
whether MYC is affected, and even cancers of similar tissue and sensitivity
profile do not share greater overlaps (Fig. 3a and Extended Data
Fig. 5d–f). To test whether this heterogeneity is associated with differential
functions of JQ1 targets, we used Tet-regulated shRNAmirs to
profile BRD2/3/4-dependent transcripts in sensitive and resistant 
leukaemia cells. In both cases, JQ1-induced changes showed the largest 
overlap with BRD4-knockdown profiles (Extended Data Fig. 6a–d),
indicating that suppression of BRD4 dominates the effects of BET 
inhibition in leukaemia. 

Despite this heterogeneity, a closer look at the few commonly regu- 
lated genes (Extended Data Fig. 6e) revealed two interesting findings: 
(1) one of two transcripts commonly induced by JQ1 was HEXIM1, 
which we found upregulated in all analysed cancers (Extended Data 
Fig. 6f), indicating that it could serve as a pharmacodynamic biomar- 
ker. (2) Surprisingly, one of three transcripts repressed after 2 h of JQ1 
treatment in both sensitive and resistant leukaemias was MYC. In 
sensitive leukaemias, this repression was durable and led to strong 
suppression of the MYC protein (Fig. 3b), while all resistant leukae- 
mias showed a rapid rebound of MYC transcription and, consequently, 
only minimal protein suppression. Similar effects were observed in a 
larger panel of haematopoietic cell lines (Fig. 3c and Extended Data 
Fig. 6g), indicating that primary BET resistance is not due to BRD4- 
independent regulation of critical targets, but driven by compensatory 
mechanisms that rapidly restore their transcription. 

To investigate how this rebound phenomenon is encoded in the 
enhancer landscape, we performed H3K27ac ChIP-seq, which revealed 
that resistant K-562 cells contain several H3K27ac peaks around MYC 
that are missing in sensitive MOLM-13 cells. These include a strong 
occupancy in the first intron of PVT1 (Fig. 3d), the same region that 
gained H3K27ac in our mouse model (Fig. 2d). To systematically ana- 
lyse how these putative MYC enhancers change their activity upon BET 

Figure 3 | Dynamic transcriptional and enhancer profiling of sensitive and
resistant cancer cell lines. a, Venn diagram depicting the overlap of expression
changes in three sensitive AML lines after 2 h of JQ1 (200 nM). b, Top, MYC
mRNA levels in indicated cell lines after 2 h or 24 h of JQ1 treatment (200 nM),
relative to DMSO-treated cells. Bottom, MYC immunoblotting in indicated
cell lines before and after 24 h of JQ1. c, MYC mRNA levels across six sensitive
and six resistant cell lines at indicated time points following JQ1 (200 nM),
normalized to DMSO (mean ± s.e.m.; Student’s t-test; **P = 0.01;
**P < 0.01). d, ChIP-seq occupancy profiles of H3K27ac at the MYC locus in
K-562 and MOLM-13 cells. Validated transcript models from the hg19 genome
assembly are shown above; y axes reflect normalized cumulative tag counts in
each region. e, STARR-seq fragment densities in K-562 cells treated with
DMSO or 250 nM JQ1. Read densities are shown as unique reads per million.
f, Sorted ratios of STARR-seq signals in DMSO versus 250 nM JQ1-treated
cells for all 156 identified STARR-seq peaks. g, Luciferase reporter assay
measuring the enhancer activity of the PVT1 element and an unrelated
enhancer (PLEK2) on a minimal MYC promoter. Shown are fold changes of
normalized luciferase signal over background ±250 nM JQ1 (n = 3; mean ±
s.e.m., Student’s t-test with Welch’s correction).



inhibition, we performed self-transcribing active regulatory region
sequencing (STARR-seq; a new method for high-throughput functional
enhancer analysis) using a library covering 3.1 megabase pairs
surrounding MYC (Extended Data Fig. 7a). While the PVT1 element
had little activity in DMSO-treated K-562 cells, its enhancer activity
strongly increased after 24 h of JQ1 treatment (Fig. 3e) and defined the
single strongest gain in the entire region (Fig. 3f). We confirmed this
effect in conventional reporter assays using a minimal MYC promoter
(Fig. 3g) and found that it does not correlate with PVT1 transcription
(Extended Data Fig. 7b), indicating that this element primarily acts as a
MYC enhancer. Collectively, our results suggest that the rebound of
MYC is driven by a focal enhancer that is formed during acquired
resistance and pre-established in primary resistant cells.
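
Ranking enhancer candidates by their change in activity, as in the sorted STARR-seq ratios, can be sketched as follows. This is an illustrative calculation only: the direction of the ratio (JQ1 over DMSO), the pseudocount and the function name are assumptions of this sketch, and the input is taken to be one normalized signal value (for example reads per million) per peak and condition.

```python
import math


def rank_peaks_by_gain(peaks, pseudocount=0.1):
    """Sort peaks by log2(JQ1 signal / DMSO signal), strongest gain first.

    peaks is a list of (peak_id, signal_dmso, signal_jq1) tuples. The
    pseudocount (assumed) keeps the ratio finite for peaks with no signal
    in one condition.
    """
    scored = []
    for peak_id, dmso, jq1 in peaks:
        ratio = math.log2((jq1 + pseudocount) / (dmso + pseudocount))
        scored.append((peak_id, ratio))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

In such a ranking, an element that is weak under DMSO and strong under JQ1 appears at the top of the list, which is the behaviour reported for the PVT1 element.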

To search for regulatory pathways driving primary resistance, we 
compared steady-state transcriptomes of sensitive and resistant leukae- 
mias. Among only 38 genes consistently upregulated in resistant leu- 
kaemias, we identified 17 genes previously implicated as Wnt signalling 

Figure 4 | Wnt signalling promotes primary and acquired BET resistance in
leukaemia. a, Heat map of genes differentially expressed between indicated
sensitive and resistant leukaemia cell lines. Genes previously implicated as Wnt
target genes are highlighted in red. b, Competitive proliferation assay of
K-562 cells expressing the indicated shRNAs. Shown is the fraction of
GFP+/shRNA+ cells under JQ1 treatment (100 nM), normalized to the fraction in
DMSO-treated control cells (n = 3, mean ± s.e.m.). c, RT-PCR analysis of MYC
mRNA levels in K-562 cells expressing the indicated shRNAs at different time
points after JQ1 treatment (200 nM) (n = 3; mean ± s.e.m.; Student’s t-test).
d, Competitive proliferation assays of MLL-AF9;NrasG12D leukaemia cells
expressing the indicated shRNAs. Shown is the fraction of mCherry+/shRNA+
cells relative to the initial measurement. After 2 days, each sample was split in
half, treated with DMSO or 50 nM JQ1 and followed up for 16 days.
e, Competitive proliferation assays of MLL-AF9;NrasG12D leukaemia cells
co-expressing mCherry and Ctnnb1 harbouring one (Ctnnb1.1x) or four
(Ctnnb1.4x) activating mutations. After 2 days, cells were treated with
50 nM JQ1 or DMSO, and the relative fraction of mCherry+/shRNA cells was
followed over time. f, RT-PCR analysis of TCF4, CCND2 and HOXB4 mRNA
levels (relative to GAPDH) in sensitive (n = 6; IC50 < 200 nM) and resistant
(n = 6; IC50 > 500 nM) primary human leukaemia samples (Student’s t-test;
*P < 0.05; **P < 0.01). g, Model of transcriptional plasticity as a driver of
primary and acquired BET resistance.



targets (Fig. 4a), which together with findings in our mouse model
(Fig. 2f) pointed to Wnt signalling as a candidate driver of BET resistance.
From these 17 genes, we selected two well-established Wnt targets,
IGF2BP1 and TCF4, to test whether they contribute to the
resistant state of K-562 cells. Suppression of TCF4 or IGF2BP1
increased the JQ1 sensitivity and diminished transcriptional rebound
phenomena in K-562 cells (Fig. 4b, c and Extended Data Fig. 8a), while
overexpression of TCF4 in MOLM-13 cells reduced their sensitivity
(Extended Data Fig. 8b, c).

To investigate whether resistant cells engage Wnt transcriptional
machinery following BET inhibition, we performed ChIP assays for
TCF7L2, a Wnt-dependent transcription factor known to drive MYC
and other Wnt targets in complex with CTNNB1. Following JQ1
treatment, established Wnt target genes showed a marked increase in
TCF7L2 binding, which was particularly prominent at the PVT1
element 3′ of MYC (Extended Data Fig. 8d), suggesting that this site
acts as a Wnt-dependent MYC enhancer. To test whether Wnt activation
drives de novo resistance, we transduced cells of our sensitive
AML mouse model with validated shRNAs targeting Apc (Extended
Data Fig. 8e, f), a negative regulator of Wnt signalling, or active
mutants of Ctnnb1. While Wnt activation had no or slightly detrimental
effects in untreated cells, JQ1 treatment led to a rapid outgrowth
of cells expressing two independent Apc shRNAs or Ctnnb1
containing four activating mutations (Ctnnb1.4x) (Fig. 4d, e and
Extended Data Fig. 8g). Single-mutant Ctnnb1 had milder effects,
indicating that the degree of BET resistance depends on Wnt activation
levels. Ctnnb1.4x also completely blunted the response to JQ1 in
vivo, and promoted resistance in two independent leukaemia models
(Extended Data Fig. 8h, i).

After demonstrating that Wnt activation drives resistance in leukaemia,
we wondered whether our Wnt expression signature would be
more generally associated with BET resistance. Through integrating
sensitivity profiles of 246 cell lines with available transcriptome
data, we found that expression of this gene set is significantly
increased in JQ1-resistant contexts (Extended Data Fig. 9a, b). As a
first step towards probing Wnt as a BET resistance driver in other
cancers, we found that suppression of Apc promotes resistance in a
mouse model of pancreatic cancer, while treatment with the Wnt inhibitor
pyrvinium synergizes with JQ1 in this model and highly resistant
ASPC-1 cells (Extended Data Fig. 9c-e). To probe whether Wnt signalling
is associated with BET resistance in primary human leukaemia,
we quantified nine Wnt-associated transcripts in sensitive and resistant
patient-derived samples (Extended Data Fig. 10a). Notably, three of
these transcripts (that is, TCF4, CCND2 and HOXB4) were significantly
overexpressed in resistant samples (Fig. 4f), while others showed a
similar trend (Extended Data Fig. 10b). To reduce context-specific
biases of single markers, we used the three significant transcripts to
establish a simple ‘resistance index’, which strongly correlated with IC50
values (Extended Data Fig. 10c) and may provide a first step towards
developing a predictive biomarker.
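
The text does not state how the ‘resistance index’ is calculated. Purely as an illustration of how three marker transcripts could be combined into one score, the sketch below averages per-gene z-scores of log2 relative expression and correlates the result with IC50 values. The scoring scheme, the function names and the use of a Pearson correlation are all assumptions of this sketch, not the published definition.

```python
import math
from statistics import mean, pstdev

MARKERS = ("TCF4", "CCND2", "HOXB4")


def resistance_index(samples):
    """Return {sample_id: mean z-score of log2 expression over MARKERS}.

    samples maps a sample id to {gene: relative expression (> 0)}. Averaging
    z-scores is a hypothetical way to pool the markers into a single value.
    """
    logs = {s: {g: math.log2(expr[g]) for g in MARKERS} for s, expr in samples.items()}
    index = {s: 0.0 for s in samples}
    for gene in MARKERS:
        values = [logs[s][gene] for s in samples]
        mu, sd = mean(values), pstdev(values)
        for s in samples:
            z = (logs[s][gene] - mu) / sd if sd > 0 else 0.0
            index[s] += z / len(MARKERS)
    return index


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A score of this kind could then be compared with log-transformed IC50 values per sample using `pearson`, keeping in mind that any threshold derived from twelve samples would need independent validation.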

Through integrative profiling and functional genetic analyses in
mouse and human leukaemia, our study reveals that BRD4 regulates
a remarkably specific and diverse set of target genes. Leukaemia cells,
through an adaptation process that is facilitated by inactivation of
PRC2, can become resistant to BET inhibition by rewiring the transcriptional
regulation of key BRD4 targets such as MYC (Fig. 4g).
Interestingly, a recent study found that in T-lymphoblastic leukaemia
MYC transcription can switch from a NOTCH1-dependent to a
BRD4-dependent mode following treatment with γ-secretase inhibitors.
Our study reveals that the BRD4-dependency of MYC can be
overcome through engaging a focal enhancer and, together with
another report, establishes WNT signalling as a major driver and
candidate biomarker of BET resistance in leukaemia. These findings
highlight that the heterogeneity and plasticity of transcriptional
machinery plays a major role in promoting resistance to
chromatin-targeted therapies.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 

METHODS

Plasmids. The shRNA library used in the screen was cloned into pLMN
(pMSCV-miR30-PGK-NeoR-IRES-GFP). The same vector was used for validation of
primary hits and PRC2 core complex partners. Additional validation studies for
Suz12 were performed using pRT3GEN-mir30 (pSIN-TRE3G-turboGFP-miR30-PGK-NeoR)
or pLENC (pMSCV-miRE-PGK-NeoR-IRES-mCherry). shRNAs
targeting Apc were cloned into pLMPC (pMSCV-miRE-PGK-PuroR-IRES-mCherry).
A Ctnnb1 cDNA harbouring four activating mutations (S33A, S37A,
T41A, S45A; Ctnnb1.4x) was obtained from Addgene (#24312) and cloned into
pMSCV-IRES-mCherry. Ctnnb1 harbouring a single mutation (S45P; Ctnnb1.1x)
was cloned into pMSCV-IRES-GFP. shRNAs targeting TCF4 and IGF2BP1 were
cloned into pRT3GEN (pSIN-TRE3G-turboGFP-miRE-PGK-NeoR), and the
TCF4 cDNA was cloned into pMSCV-PGK-NeoR-IRES-GFP.

Antibodies. The following antibodies were used: Suz12 (39357, Active Motif),
histone H3K27me3 (39155, Active Motif; 07-449, Millipore), histone H3K36me3
(Ab9050, Abcam), histone H3K27ac (ab4729, Abcam), histone H3 (61277, Active
Motif), MYC (56058, Cell Signaling), TCF4 (ab130014, Abcam), IGF2BP1
(RN001M, MBL), TCF7L2 (sc-8631, Santa Cruz), BRD4 (A301-985A1002,
Bethyl), BRD3 (A302-583A, Bethyl), BRD2 (A302-368A, Bethyl), APC
(MABC202, Millipore) and β-actin (A3854, Sigma-Aldrich and ab49900,
Abcam). Secondary antibodies were anti-mouse (926-32210, LI-COR), anti-rabbit
(926-32211, LI-COR) and anti-goat (926-32214, LI-COR). Antibodies used for
FACS were: PE anti-mouse CD117/c-Kit (105808, Biolegend), APC anti-mouse
CD11b/Mac-1 (101212, Biolegend), PE-Cy7 anti-mouse Ly-6G/Gr-1 (108416,
Biolegend), PE-Cy7 anti-mouse Ly-6A/E/Sca-1 (108114, Biolegend),
PerCP/Cy5.5 anti-mouse CD45.2 (109828, Biolegend), PE/Cy7 anti-mouse
CD117/c-Kit (105814, Biolegend), APC anti-mouse CD150 (115925, Biolegend), PE-Cy7
anti-human CD11B/MAC-1 (101215, Biolegend) and APC anti-human CD36
(336207, Biolegend).

Pooled shRNAmir screening. An shRNA library targeting 626 chromatin- 
associated mouse genes (Supplementary Information; shRNAs prim. screen per- 
formance) was assembled in pLMN”* by combining an existing chromatin-focused 
library (243 genes, 1094 shRNAs)! and 1793 additional shRNAs that were designed 
based on optimized sensor rules*”*’, and cloned, sequence-verified and pooled as 
previously described’. After spiking in several control shRNAs at equimolar ratios, 
the final pool of 2917 pLMN-shRNAs was transduced in triplicate into the same 
MLL-AF9;Nras“!??-driven model that we had previously used to identify BRD4 as a 
target in AML’. To ensure library representation a total of 60 million cells were 
infected with 5% transduction efficiency using conditions that predominantly lead 
toa single retroviral integration and represent each shRNA in a calculated number of 
> 800 cells. Throughout G418 drug selection (1 mg ml’) more than 50 million cells 
were maintained at each passage to preserve library representation. Two days after 
infection T0 samples were acquired (6 million GFP+ cells per replicate) using a
FACSAria II (BD Biosciences) and deep-sequencing analysis confirmed that the library was
well represented in all replicates. After G418 drug selection for 7 days, T1 samples
were obtained using FACS (6 million GFP+ cells per replicate), and cells were
subsequently cultured in the presence of 0.5 mg ml⁻¹ G418 (15 million cells per replicate
were maintained at each passage). To control for unspecific clonal events during the
JQ1 resistance screen, stably selected cells were mixed in a 4:1 ratio with G418-selected
cells expressing an empty miR-30 cassette, mCherry and NeoR (pMSCV-
miR30.empty-PGK-NeoR-IRES-mCherry). Replicates were subsequently treated
for 4 days with 100 nM JQ1 followed by 50 nM JQ1 for 22 days. Culture medium
was exchanged every 2 days until cells proliferated comparably to wild-type cells and
afterwards passaged seven times. Three replicates cultured in the presence of vehicle 
(0.033% DMSO) were maintained in parallel. After 22 days (T2), about 6 million 
shRNA-expressing (GFP+) cells were sorted for each replicate by FACS. While
mCherry+ cells completely disappeared over time, GFP+ cells survived and
eventually grew in the presence of 50 nM JQ1 (corresponding to an ICz» dose in this
model), indicating that resistance did not emerge from random clonal events, but 
from shRNA-mediated effects. Genomic DNA from T1 and T2 samples was isolated 
by two rounds of phenol extraction using PhaseLock tubes (5PRIME), followed by 
isopropanol precipitation. Deep-sequencing libraries were generated by PCR amp- 
lification of shRNA guide strands using primers that tag the product with standard 
Illumina adapters (p7+Loop: CAAGCAGAAGACGGCATACGATAGTGAAGC 
CACAGATGT; p5+PGK: AATGATACGGCGACCACCGATGGATGTGGAAT 
GTGTGCGAGG). For each sample, DNA from at least 5 x 10° cells was used as 
template in multiple parallel 50-µl PCR reactions, each containing 0.5 µg template,
1× AmpliTaq Gold buffer, 0.2 mM of each dNTP, 2 mM MgCl2, 0.3 µM of each
primer and 2.5 U AmpliTaq Gold (Life Technologies), which were run using the
following cycling parameters: 95 °C for 10 min; 35 cycles of 95 °C for 30 s, 52 °C for
45 s and 72 °C for 60 s; 72 °C for 7 min. PCR products (340 bp) were combined for
each sample, column purified using the QIAquick PCR purification kit (Qiagen) and
further purified on a 1% agarose gel (QIAquick gel extraction kit, Qiagen). Libraries
were analysed on an Illumina HiSeq 2000 deep sequencer; 22 nucleotides of the guide
strand were sequenced using a custom primer (miR30EcoRISeq, TAGCCCCTT 
GAATTCCGAGGCAGTAGGCA). To provide a sufficient baseline for detecting 
shRNA depletion in experimental samples, we aimed to acquire >500 reads per 
shRNA in the sequenced shRNA pool to compensate for variation in shRNA rep- 
resentation inherent in the pooled plasmid preparation or introduced by PCR biases. 
With these conditions, we acquired baselines of > 500 reads for all 2,917 shRNAs. 
Sequence processing was performed using a customized Galaxy platform. For each
shRNA and condition, the number of matching reads was normalized to the total 
number of library-specific reads per lane and imported into a database for further 
analysis (Access 2013, Microsoft). All primary screen data are provided under 
Supplementary Information; shRNAs prim. screen performance. 
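
The representation and normalization figures quoted above can be checked with simple arithmetic. The following Python sketch recomputes the expected number of transduced cells per shRNA from the stated cell number, transduction efficiency and pool size, and shows one way to normalize read counts to the total library-specific reads of a lane. The function names, the example shRNA identifiers and the reads-per-million scale are illustrative assumptions; the original analysis was run in Galaxy and Access, not with this code.

```python
def cells_per_shrna(total_cells, transduction_efficiency, library_size):
    """Expected number of transduced cells carrying each shRNA."""
    return total_cells * transduction_efficiency / library_size


def normalize_reads(read_counts, scale=1_000_000):
    """Normalize per-shRNA read counts to the total library-specific reads in a lane.

    The reads-per-million scale is an assumption chosen for readability.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no library-specific reads in this lane")
    return {shrna: count * scale / total for shrna, count in read_counts.items()}


# Values from the text: 60 million cells, 5% transduction, 2917 shRNAs.
coverage = cells_per_shrna(60_000_000, 0.05, 2917)
print(round(coverage))  # 1028, consistent with the stated >800 cells per shRNA

# Hypothetical read counts for two made-up shRNA identifiers.
normalized = normalize_reads({"shA": 600, "shB": 400})
```

Under these numbers roughly 3 million cells are transduced in total, so the >800 cells per shRNA stated in the text leaves some margin for uneven representation in the plasmid pool.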

Cell culture, retroviral gene transfer and RNAi studies. Mouse
MLL-AF9;NrasG12D AML cells (RN2) and other murine leukaemia cell lines were derived
from bone marrow of terminally diseased mice transplanted with fetal liver cells
engineered to express the indicated oncogenes, as previously described. Murine
leukaemia cells were cultured in RPMI 1640 (Gibco-Invitrogen) supplemented with
10% FBS, 20 mM glutamate, 10 mM sodium pyruvate, 10 mM HEPES (pH 7.3),
100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. MV4-11, MEG-01 and
K-562 cells were grown in IMDM with 10% FBS, 20 mM glutamate, 10 mM sodium
pyruvate, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. All human AML cell
lines were cultured in RPMI 1640 (Gibco-Invitrogen) supplemented with 10% FBS,
20 mM glutamate, 10 mM sodium pyruvate, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹
streptomycin, except KASUMI-1 cells, which were cultured in 20% FBS. Human
PDAC cell lines CAPAN-2 and MIAPACA-2 were cultured in DMEM
(Gibco-Invitrogen) supplemented with 10% FBS, 20 mM glutamate, 10 mM sodium
pyruvate, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. SU-8686 and ASPC-1
were cultured in RPMI 1640 (Gibco-Invitrogen) supplemented with 10% FBS,
20 mM glutamate, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. HUPT-4
and HPAF-2 were cultured in EMEM (Sigma-Aldrich) supplemented with 10%
FBS, 20 mM glutamate, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin.
Human SCLC cell lines DMS-273, SHP-77, NCI-H82 and NCI-H1048 were cultured
in RPMI 1640 (Gibco-Invitrogen) supplemented with 10% FBS, 20 mM glutamate,
10% GlutaMAX, 100 U ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. Cell lines
were obtained from ATCC (http://www.lgcstandards-atcc.org/en.aspx) or DSMZ 
(http://www.dsmz.de/) and tested for mycoplasma infection on a regular basis using 
a commercial biochemical test (Lonza). 

Tet-on competent murine pancreatic cancer cells (KRPC2) were generated and 
characterized as previously described. In brief, pancreatic progenitor cells
isolated from a murine fetus (ED17.5-18.5) harbouring a conditional endogenous
KrasG12D allele (lox-STOP-lox-KrasG12D) and a conditional Trp53 deletion
(Trps3 38 were transduced with retroviral constructs expressing Myc, 4-OHT- 
inducible CreERT2, codon-optimized Firefly Luciferase (Luc2), and rtTA3, and
orthotopically injected into the pancreas of syngeneic recipient mice (detailed 
protocols available upon request). Emerging tumours were characterized to har- 
bour histological features of human pancreatic adenocarcinoma and used to derive 
a cell line (KRPC2), which was cultured in DMEM (Gibco-Invitrogen) with 10% 
FBS, 20 mM glutamate, 10 mM sodium pyruvate, 100 U ml⁻¹ penicillin and
100 µg ml⁻¹ streptomycin.

Retroviral packaging was performed using Platinum-E cells (Cell Biolabs)
according to established protocols. In brief, for each calcium phosphate transfection,
10-20 µg plasmid DNA and 5 µg helper plasmid (pCMV-Gag-Pol, Cell Biolabs)
were used. Transduction efficiencies of retroviral constructs were measured 48 h
post infection by flow cytometry (Guava easyCyte, Millipore). Transduced cell
populations were usually selected 48 h post infection using 0.5 mg ml⁻¹ G418 (Gibco, Life
Technologies). To generate resistant MLL-AF9;NrasG12D murine leukaemia, cells
were transduced with retrovirally delivered shRNAs or cDNAs at a transduction
efficiency of 5-15% (predominantly yielding single viral integrations per cell).
JQ1 treatment was started post infection and medium was carefully exchanged every
2 days to maintain a 50 nM JQ1 concentration until cells were proliferating normally
and daily passaging was required. The starting point of JQ1 treatment is critical and
can vary between different targets and shRNAs since the effect is dependent on 
shRNA potency, protein half-life of the respective target and the half-life of the 
associated chromatin mark. In case of strong detrimental effects the optimal start 
of JQ1 treatment needs to be determined for every target. 

MOLM-13 and K-562 cells were modified to express the ecotropic receptor and 
rtTA3 using retroviral transduction of pMSCV-RIEP (pMSCV-rtTA3-IRES- 
EcoR-PGK-Puro) or lentiviral transduction of pWPXLd-RIEP (pWPXLd- 
rtTA3-IRES-EcoR-PGK-Puro) followed by puromycin selection (0.5 and 1 µg ml⁻¹,
respectively, for 1 week). Derived cell lines were subsequently transduced with eco- 
tropically packaged retroviruses. 

For flow cytometry immunophenotyping, cultured cells were collected and 
stained using the indicated antibodies (two million cells in 100 µl FACS buffer:
PBS, 5% FCS, 0.005% sodium acetate, and human or mouse TruStain fcX Fc-receptor
blocking agent (Biolegend)) for 20 min at 4 °C. Antibodies were diluted
1:400. Stained samples were analysed on an LSR Fortessa (BD) flow cytometer. 
Data analysis was performed using FlowJo software (Treestar). Cell viability assays 
for drug IC50 and GI50 determination were performed using Alamar blue staining
according to manufacturer’s guidelines, or by counting the increase in viable cell 
numbers over 72 h in the presence of different JQ1 concentrations. Dead cells were 
excluded using propidium iodide (PI) staining. Measurements of cell concentra- 
tion were performed on a Guava easyCyte (Millipore), gating only viable cells 
(FSC/SSC/PI−). Synergy was evaluated using the Chou-Talalay Combination Index
(CI), calculated with the CompuSyn program (ComboSyn, Inc.). GI50 values for
all tested cell lines are provided in Supplementary Information; JQ1 profiled cell 
lines. 

Competitive proliferation assays using shRNAs in pLMN, pLENC, pLMP or 
pRT3GEN-mir30/RT3GEN vectors were performed as described previously.
Briefly, for assays involving Tet-regulated shRNA expression vectors, Tet-on
competent MLL-AF9;NrasG12D mouse AML cells or Tet-on competent derivatives
of human leukaemia cell lines were transduced with the respective plasmid at
infection rates of <5% to ensure single-copy integration. For constitutive shRNA
expression (pLMN), cells were analysed for GFP expression 1 day post infection.
pRT3GEN transduced cells were selected for 7 days using G418 (0.5 mg ml⁻¹),
mixed with 10% uninfected cells, and shRNA expression was induced through
addition of doxycycline (DOX) to a final concentration of 1 µg ml⁻¹. The percentage
of shRNA-expressing cells (turboGFP+ or GFP+) was measured daily using
flow cytometry. All values were normalized to day 1. Once the percentage of viable 
cells was below 20%, measurements were discontinued (indicated in the graph by 
the discontinuation of the respective sample). Sequences for all shRNAs used in 
single assays are provided in Supplementary Information; shRNAs single assays. 
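
The normalization and stopping rule described for the competitive proliferation assays can be sketched as follows. This Python example divides each daily percentage of shRNA-expressing cells by the day 1 value and stops recording once viability drops below 20%. The function name, the dictionary layout and the example values are hypothetical and serve only to illustrate the rule stated in the text.

```python
def normalize_time_course(gfp_percent, viable_percent, baseline_day=1, min_viability=20.0):
    """Return {day: fraction of shRNA-expressing cells relative to baseline_day}.

    Measurements stop at the first day on which viability falls below
    min_viability, mirroring the discontinuation rule in the text.
    """
    baseline = gfp_percent[baseline_day]
    relative = {}
    for day in sorted(gfp_percent):
        if day < baseline_day:
            continue
        if viable_percent[day] < min_viability:
            break
        relative[day] = gfp_percent[day] / baseline
    return relative


# Hypothetical daily measurements (percent GFP+ cells and percent viable cells).
gfp = {1: 80.0, 2: 60.0, 3: 40.0, 4: 20.0}
viable = {1: 95.0, 2: 90.0, 3: 50.0, 4: 15.0}
course = normalize_time_course(gfp, viable)
print(course)  # {1: 1.0, 2: 0.75, 3: 0.5}
```

Day 4 is dropped in this example because viability (15%) is below the 20% cut-off.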

Colony-forming assays were performed in Methocult (StemCell Technologies,
Cat. No. M3231). 1000 AML cells (in 100 µl standard medium) were added to
900 µl Methocult supplemented with 1 µl of DMSO or JQ1 (to a final concentration
of 200 nM). After 7 days the types of colonies were enumerated and normalized
to input.

Gene expression and protein level analysis. RNA was prepared using the RNeasy 
Plus Mini Kit (Qiagen). Synthesis of cDNA was performed using SuperScript III 
Reverse Transcriptase (Invitrogen) or Taqman reverse transcription kit (Applied 
Biosystems) with random hexamers. Quantitative PCR analysis was performed on 
an ABI 7900HT with SYBR green (ABI). All signals were quantified using the ΔCt
method and were normalized to the levels of β-actin or GAPDH. All primers used
in this study are presented in Supplementary Information; Primer. 
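
As an illustration of the ΔCt quantification mentioned above, the sketch below converts threshold cycles into expression relative to a reference gene. It assumes perfect doubling per PCR cycle (100% amplification efficiency), which the text does not state; the function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Expression of a target gene relative to a reference gene (2 ** -dCt).

    Assumes that the amount of product doubles with every cycle.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)


# Hypothetical Ct values: target at cycle 24, reference (for example GAPDH) at cycle 18.
rel = relative_expression(24.0, 18.0)
print(rel)  # 0.015625, i.e. 1/64 of the reference signal
```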

For immunoblotting of Suz12, histone H3K27me3, BRD4, TCF4, IGF2BP1 and 
MYC, 20 µg of whole-cell lysate (lysis buffer: 50 mM TRIS pH 8, 250 mM NaCl,
0.5% NP-40, 5 mM EDTA) were loaded onto each lane. Protein extracts were
resolved by SDS polyacrylamide gel electrophoresis (SDS-PAGE) and transferred 
to nitrocellulose for blotting. 

Preparation of RNA-seq libraries. RNA from human cell lines or mouse AML 
cells was isolated using the RNeasy Plus Mini Kit (Qiagen). Messenger RNA was 
obtained using two rounds of poly(A) selection using the Dynabeads mRNA 
purification kit (Invitrogen) and subsequently fragmented by incubation at 
94 °C for 3 min (fragmentation buffer: 40 mM Tris base adjusted to pH 8.2 with 
glacial acetic acid, 100 mM potassium acetate, 30 mM magnesium acetate in
DEPC-treated H2O). Cell lines infected with shRNA-expressing vectors were
FACS-sorted for shRNA expression (GFP+) before RNA extraction. The fragmented
mRNA was used as template for first-strand cDNA synthesis with random
hexamers and a NEBNext DNA Library Prep Master Mix Set for Illumina 
Synthesis kit (Invitrogen). The second-strand cDNA was synthesized with 
100 mM dATP, dCTP, dGTP and dUTP in the presence of RNase H, E. coli
DNA polymerase I and DNA ligase (Invitrogen). The incorporation of dUTP 
allowed elimination of the second strand during library preparation (described 
below) and thus preservation of strand specificity. 

Chromatin immunoprecipitation and ChIP-seq library construction. ChIP 
assays were performed exactly as described in two to three independent biological
replicates. Crosslinking was performed with sequential EGS and formaldehyde
or with formaldehyde alone. All samples were quantified by quantitative PCR
performed using SYBR green (ABI) on an ABI 7900HT after crosslink reversal. 
Each immunoprecipitate signal was referenced to an input standard-curve dilu- 
tion series (immunoprecipitate/input) to normalize for differences in starting cell 
number and for primer amplification efficiency and normalized to a control 
region. Two independent biological replicates were performed for each ChIP- 
seq experiment. 

For ChIP-Seq library construction, 1 X 10’ leukaemia cells were crosslinked 
using 1% formaldehyde for 20 min at room temperature. After purifying
immunoprecipitated DNA using QIAquick Gel Extraction Kit (QIAGEN), ChIP-seq
libraries were constructed using the TruSeq ChIP Sample Prep Kit (Illumina)
following manufacturer’s instructions with the following exception: following
adaptor ligation, libraries were amplified for 15 cycles. In brief, purified ChIP 
DNA was repaired to blunt ends followed by dA-tailing process to ligate adapters 
for multiplex-based sequencing. Size-selection of adaptor-ligated ChIP-DNA was 
performed by agarose-gel-based selection of DNA fragments ranging from 200 to
350 bp. The final libraries were amplified for 15 cycles, and size-selected with 
SPRI clean-up by using AMPure XP beads (Beckman Coulter). Library quality 
was determined using a Bioanalyzer (Agilent). Library sizes ranged from 250 to
300 bp. ChIP-seq libraries were sequenced using an Illumina HiSeq 2000 platform. 
Barcoded libraries were sequenced in a multiplexed fashion with two to six lib- 
raries at equal molar ratio, with single-end reads of 50 bases. 

Illumina deep sequencing. Two to five nanograms of cDNA or DNA precipitated 
by ChIP was used as starting material for the generation of single-end sequencing 
libraries as described in Illumina’s ChIP-seq sample preparation protocol. DNA 
fragments of the following sizes were selected for these experiments: 200-350 bp 
for ChIP-seq, 150-700 bp for RNA-seq. For strand-specific RNA-seq, the uridine
residues present in one cDNA strand were digested with uracil-N-glycosylase 
(New England Biolabs), followed by PCR amplification. Completed libraries were 
quantified on a Bioanalyzer using the dsDNA 1000 assay kit (Agilent) and the 
qPCR NGS library quantification kit (Agilent). Cluster generation and sequencing
were performed using a HiSeq 2000 system with a read length of 50 or 100 nucleotides
according to the manufacturer’s guidelines (Illumina).

Analysis of RNA-seq data. The calculation of RNA expression values was based 
on the RefSeq database, which was downloaded from UCSC on 10 January 2014. 
Genes with overlapping exons were flagged and double entries (that is, exactly the 
same gene at two different genomic locations) were renamed. Identical genes with 
more than one assigned gene symbol were flagged. Genes with several transcripts 
were merged to consensus genes consisting of a union of all underlying exons 
using the fuge software (I. Tamir, unpublished), which resulted in 25,098 gene 
models. Gene names and accession numbers of identical genes are presented in 
Supplementary Information; refseq.hg19.2014_0110.imp.map. 

Paired-end and single-read fragments were trimmed on their 5’ end (six cycles 
PE100, two cycles SR50 NEB RNA sample prep protocol), adaptors were removed 
using cutadapt v1.4.2 and reads with a length of less than 18 bp in any read of the
pair were discarded. The trimmed and adaptor-free reads were aligned against
rDNA of the respective organism using bowtie2 v2.0.2 for paired-end and bowtie
v0.12.5 for single-end reads. The rRNA-cleaned paired-end reads were aligned
against the transcriptome to estimate the fragment size and the standard deviation
of the fragment size using bwa v0.6.2. The rRNA-cleaned reads were aligned to
the genome with the TopHat splice junction mapper for RNA-seq reads. The
uniquely aligning reads were used for counting per gene with htseq-count v0.6.1p1
with the overlap-resolution mode option set to ‘union’ using the processed RefSeq- 
annotated gene database. Reads per kilobase per million mapped reads (RPKM) 
values for all samples were calculated as in ref. 47. Read counts for all samples were 
normalized using the median-of-ratios method implemented in DESeq2, 
v1.2.10. Counts were transformed for PCA using DESeq2’s variance stabilizing
transformation. For both the resistant and sensitive groups, three independent 
RNA-seq experiments on different leukaemia cell lines were available. The edgeR 
R package v3.4.2 was used to calculate the significance of difference in expression
of resistant and sensitive leukaemia cell lines (DMSO treated) in Fig. 4a. For this, 
raw counts were normalized using the ‘relative log expression’ method analogous 
to DESeq2. Dispersions were sequentially estimated with the default settings by 
first estimating a common dispersion for the gene set and then ‘shrinking’ tag-wise 
dispersion values towards that dispersion trend. Statistical modelling and analysis 
was done by fitting a model taking the separation into sensitive and resistant cell 
lines into account. Differential expression of genes between those two groups was 
subjected to statistical analysis and resulting P values were adjusted for multiple 
testing with the Benjamini-Hochberg correction. The resulting list was further
filtered by only considering genes which all have higher normalized counts in the 
sensitive cell lines versus the resistant cell lines or vice versa. 
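
Two of the quantities named in this section have compact textbook definitions, sketched below in Python: RPKM for a single gene and the Benjamini-Hochberg adjustment of a list of P values. This is a plain restatement of the standard formulas, not the code used in the study (which relied on htseq-count, DESeq2 and edgeR); the function names and example numbers are hypothetical.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene model per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest P value so that adjusted
    # values never increase as the raw P values decrease.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical gene: 500 reads, 2 kb consensus model, 10 million mapped reads.
example_rpkm = rpkm(500, 2000, 10_000_000)
print(example_rpkm)  # 25.0

# Hypothetical P values for four genes.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Here the gene length would be the length of the merged consensus gene model (the union of all exons) described above.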

Gene set enrichment analysis. Gene set enrichment analysis was performed
using GSEA v2.07 software with 1,000 gene set permutations. KEGG signalling 
gene sets and Wnt signalling associated gene sets were obtained from the 
Molecular Signatures Database v4.0 (MSigDB, http://www.broadinstitute.org/ 
gsea/msigdb/index.jsp). To perform GSEA with human gene signatures, 
human-mouse homologues from the mouse gene sets were identified and con- 
verted into human gene names using the NCBI homologene database (build68). 
Ambiguous homologue mapping, that is, different mouse gene names pointing to 
the same human homologue, was excluded. A detailed description of GSEA
methodology and interpretation is provided at
http://www.broadinstitute.org/gsea/doc/GSEAUserGuideFrame.html. In brief, the
normalized enrichment score (NES) provides “the degree to which a gene set is
overrepresented at the top or bottom of a ranked list of genes”. The nominal P value
(Nom p-val) describes “the
statistical significance of the enrichment score”. The false discovery rate q value 
(FDR q-val) is “the estimated probability that a gene set with a given NES repre- 
sents a false positive finding”. In general, given the lack of coherence in most 
expression data sets and the relatively small number of gene sets being analysed, 
an FDR cut-off of 25% is appropriate. All gene sets used in this study are provided 
in Supplementary Information; GSEA gene sets. 
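
The exclusion of ambiguous homologue mappings can be illustrated with a small Python sketch. It converts a mouse gene set to human gene names and drops every human gene that is reached from more than one mouse gene. The lookup table here is a toy dictionary: `GeneA1`, `GeneA2` and `GENEA` are made-up placeholders, and the real analysis used the NCBI HomoloGene database rather than this function.

```python
from collections import defaultdict


def unambiguous_human_names(mouse_genes, mouse_to_human):
    """Map mouse gene symbols to human homologues.

    Human genes that are the homologue of more than one mouse gene in the
    set are excluded; mouse genes without a homologue are skipped.
    """
    hits = defaultdict(set)
    for mouse in mouse_genes:
        human = mouse_to_human.get(mouse)
        if human is not None:
            hits[human].add(mouse)
    return sorted(human for human, mice in hits.items() if len(mice) == 1)


# Toy lookup table; the GeneA entries are hypothetical placeholders.
toy_table = {"Myc": "MYC", "Tcf4": "TCF4", "GeneA1": "GENEA", "GeneA2": "GENEA"}
human_set = unambiguous_human_names(["Myc", "Tcf4", "GeneA1", "GeneA2", "NoHomolog"], toy_table)
print(human_set)  # ['MYC', 'TCF4']
```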

ChIP-seq data analysis. Adaptors of the single-read fragments were removed 
using cutadapt v1.4.2 and reads with a length of less than 18 bp in any read of
the pair were discarded. The remaining reads were aligned to the genome (mouse: 
mm10, human: hg19) using Bowtie v1.0.0. Reads with same start and end posi- 
tion on the same strand were removed from the alignment. To identify ChIP-seq 
peaks, we used the MACS2 v2.1.0.20141030 (https://github.com/taoliu/MACS/) 
peak finding algorithm. A threefold enrichment relative to input control samples 
was used for peak calling as well as the option to call broad peaks. Building a 
shifting model was disabled and the small nearby and large nearby region para- 
meters were set to 5,000 and 20,000 respectively. The extension size was set to the 
respective median insert size of the ChIP-seq treatment sample for paired-end data 
and the estimated fragment size for single-end data. Read numbers for the
resulting peaks were quantified using the BEDTools suite and normalized to total
mapped reads. The Brd4 ChIP-seq track was obtained from published data sets
(GEO sample GSM1262345). Peaks were assigned to modified versions of the 
RefSeq database (GRCh37/hg19 downloaded on 10 January 2014 from UCSC 
Table Browser and GRCm38/mm10 downloaded on 30 June 2014 from UCSC 
Table Browser) by assigning peaks to the closest upstream/downstream TSS using 
BEDTools. For these versions, overlapping genes were flagged and double entries 
(that is, exactly the same gene at two different genomic locations) were renamed. 
Identical genes with more than one assigned gene symbol were flagged. For the 
density heat maps, ChIP-seq peaks were identified using the MACS version 
1.4.0beta (Model based Analysis of ChIP-Seq) peak finding algorithm. A P value
threshold of enrichment of 1 X10~°, a false discovery rate (FDR) of less than 1%, 
and a fivefold enrichment relative to input control samples were used for all peak 
calling. The Brd4 peaks were considered as promoter if the peak showed at least 
1 bp overlap within ±200 bp of RefSeq gene TSSs. If Brd4-called peaks showed no
overlap within ±200 bp of RefSeq gene TSSs, they were considered as enhancer-bound
peaks. Heat map matrices were created by counting tags using a 10 kb
window (±5 kb of the peak summit) and 20 bp bin size. Further visualization of
heat map matrices was done using Java TreeView 1.1.6r4
(http://jtreeview.sourceforge.net). For comparing DMSO and JQ1 treated ChIP-seq data sets, heat map
matrices were normalized to total mapped reads counts before visualization with 
Java TreeView. 
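
The promoter/enhancer rule and the heat map binning described above can be sketched in a few lines of Python. The code assumes 0-based, half-open coordinates on a single chromosome and a plain list of TSS positions; both are simplifying assumptions, and the function names are hypothetical. The study itself used BEDTools and MACS for these steps.

```python
def classify_peak(peak_start, peak_end, tss_positions, flank=200):
    """Return 'promoter' if the peak overlaps TSS +/- flank by at least 1 bp, else 'enhancer'."""
    for tss in tss_positions:
        window_start = tss - flank
        window_end = tss + flank + 1  # half-open end, so tss + flank is included
        if peak_start < window_end and peak_end > window_start:
            return "promoter"
    return "enhancer"


def heatmap_bins(summit, half_window=5000, bin_size=20):
    """Bin intervals covering summit +/- half_window in steps of bin_size."""
    start = summit - half_window
    end = summit + half_window
    return [(left, left + bin_size) for left in range(start, end, bin_size)]


# Hypothetical coordinates.
print(classify_peak(1000, 1300, [1450]))  # promoter
print(classify_peak(1000, 1200, [1450]))  # enhancer
bins = heatmap_bins(100_000)
print(len(bins))  # 500 bins of 20 bp across the 10 kb window
```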

STARR-seq screens. STARR-seq screens were done as described previously,
with the following exceptions. DNA from 21 bacterial artificial chromosomes 
(BACs), which were available for the extended MYC locus covering approximately 
91% of the surrounding 3 Mb of the genomic region, plus 25 genic control BACs, 
were used for library generation. The fragmented BAC DNA was cloned into the 
human STARR-seq screening vector containing a minimal MYC promoter. We
transfected 500 µg of this library into 1 × 10° K-562 cells per condition, using a
BTX Agile Pulse MAX system (Harvard Apparatus) following the manufacturer’s
protocol. Electroporation conditions: two pulses at 750 V for 0.5 ms (100 ms
interval), followed by 5 pulses of 250 V for 2 ms (100 ms interval) in BTXpress
cytoporation medium T. After electroporation, the cells were grown at 37 °C in
RPMI-1640 containing either DMSO or 250 nM JQ1 and were harvested after 24 h. For
each screen we isolated poly(A) RNA and processed it as described before, with a
different extension time of 45 s in the final amplification of the cDNA and modified
primers for the first cDNA amplification. All BACs used for the STARR-seq
screen are provided in Supplementary Information; STARR-seq BACs.

STARR-seq deep sequencing and analysis. All sequencing was performed as a 
50-cycle paired-end run on an Illumina HiSeq2000 machine. STARR-seq and 
input read processing was performed as described, with the following exceptions.
Libraries from different conditions were normalized to each other by subsampling 
to 10° reads per condition using reservoir sampling. Peaks were called using macs2 
(v2.1.0) with the input library as background. The peaks were then filtered for a 
minimal fold enrichment of three and only peaks with at least five reads per kb 
were considered for further analysis. Peak calling was performed for STARR-seq 
reads from both conditions (DMSO and JQ1) and the resulting peaks were merged 
using bedtools merge. STARR-seq enrichment was calculated using bedtools
coverage. Differential fold enrichment was calculated as the ratio between
enrichments in JQ1 over DMSO and expressed in log2.
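
Reservoir sampling and the log2 ratio used in this section can be sketched as follows. The Python example implements the classic single-pass reservoir algorithm (Algorithm R) to draw a fixed number of reads from a stream of unknown length, and computes the differential enrichment as log2(JQ1/DMSO). The function names, the seed and the example values are illustrative assumptions, not the pipeline used in the study.

```python
import math
import random


def reservoir_sample(reads, k, seed=0):
    """Draw k items uniformly at random from an iterable in a single pass."""
    rng = random.Random(seed)
    reservoir = []
    for i, read in enumerate(reads):
        if i < k:
            reservoir.append(read)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = read
    return reservoir


def log2_differential(enrichment_jq1, enrichment_dmso):
    """Differential fold enrichment of JQ1 over DMSO on a log2 scale."""
    return math.log2(enrichment_jq1 / enrichment_dmso)


# Hypothetical stream of 1,000 read identifiers subsampled to 10.
subsample = reservoir_sample(range(1000), 10)
print(len(subsample))  # 10
print(log2_differential(8.0, 2.0))  # 2.0
```

Subsampling both conditions to the same depth in this way makes the enrichment values directly comparable before the ratio is taken.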

Luciferase assays and data analysis. Luciferase assays were done as described 
previously, with the following exceptions. Enhancer candidate regions were
amplified from genomic DNA and shuttled into a modified pGL4 luciferase
reporter vector containing a minimal MYC promoter
(chr8:128,748,440-128,748,550). Each construct was tested by co-transfecting 10° K-562 cells with
45 µg of the firefly luciferase construct and 5 µg of a renilla luciferase control
plasmid (pGL4.75, Promega). After electroporation, the cells were grown at
37 °C in RPMI-1640 containing either DMSO or 250 nM JQ1 and were harvested
after 46 h. Relative luciferase signals (firefly/renilla) were normalized to signal
from two negative background sequences chosen based on their absence of 
STARR-seq signal and expressed as fold change over background. We analysed 
the statistical significance of differential luciferase signals with or without JQ1 
treatment using an unpaired t-test with Welch’s correction for three biological 
replicates. Primer sequences which were used to amplify the particular enhancer 
regions are deposited in Supplementary Information; Primer. 
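
The luciferase normalization and the Welch test statistic can be written out as a short Python sketch. It computes the firefly/renilla ratio as fold change over the mean of the background ratios, and the t statistic with Welch-Satterthwaite degrees of freedom for two groups of replicates. The function names and numbers are hypothetical; averaging the two background sequences is an assumption, since the text does not say how they were combined.

```python
from statistics import mean, variance


def fold_over_background(firefly, renilla, background_ratios):
    """Firefly/renilla signal expressed as fold change over the mean background ratio."""
    return (firefly / renilla) / mean(background_ratios)


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    var_a = variance(group_a) / len(group_a)
    var_b = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / (var_a + var_b) ** 0.5
    df = (var_a + var_b) ** 2 / (
        var_a ** 2 / (len(group_a) - 1) + var_b ** 2 / (len(group_b) - 1)
    )
    return t, df


# Hypothetical readings: one construct and two background sequences.
fold = fold_over_background(900.0, 30.0, [2.0, 4.0])
print(fold)  # 10.0

# Hypothetical fold changes for three biological replicates per condition.
t_stat, dof = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
```

Turning the statistic into a P value requires the t distribution; with SciPy available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test directly.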

Animal experiments. Mouse MLL-AF9;NrasG12D leukaemia cells (RN2) were
transplanted by tail-vein injection of 1 × 10° cells into sub-lethally (5.5 Gy)
irradiated female B6/SJL (CD45.1) recipient mice at the age of 6-10 weeks (n = 5, for
each individual experiment). For whole-body bioluminescent imaging, mice were
intraperitoneally injected with 50 mg kg⁻¹ D-luciferin (Goldbio), and after 5 min
were analysed using an IVIS Spectrum system (Caliper LifeSciences). For JQ1
treatment trials, a stock of 100 mg ml⁻¹ JQ1 in DMSO was 20-fold diluted by
dropwise addition of a 10% 2-hydroxypropyl-β-cyclodextrin carrier (Sigma)
under vortexing, yielding a 5 mg ml⁻¹ final solution. Mice were intraperitoneally
injected daily with freshly diluted JQ1 (50 mg kg⁻¹) or a similar volume of carrier
containing 5% DMSO. JQ1 treatment was started 3 days after transplantation of
RN2 cells stably transduced with pLMN shRen.713 and shSuz12.1676. Mice
transplanted with RN2 cells expressing β-catenin cDNA or empty vector were treated
with JQ1 daily starting 1 day after transplantation. All animals were maintained in
the pathogen-free animal facility of the Research Institute of Molecular Pathology, 
and all procedures were carried out according to an ethical animal license, which is 
approved and regularly controlled by the Austrian Veterinary Authorities. No 
statistical methods were used to predetermine sample size. The investigators were 
not blinded to allocation during experiments and outcome assessment. 
Randomization was not applied because all animals used in this study were similar 
for age, sex and strain background. 
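
The dosing figures in this section are internally consistent, as the following Python sketch shows. It derives the working concentration and the DMSO content from the stock and dilution factor given in the text, and the injection volume for a given body weight. The 20 g body weight is a hypothetical example, and the function names are illustrative.

```python
def working_concentration(stock_mg_per_ml=100.0, dilution_factor=20.0):
    """Concentration of JQ1 after diluting the DMSO stock into carrier."""
    return stock_mg_per_ml / dilution_factor


def dmso_percent(dilution_factor=20.0):
    """Volume percentage of DMSO in the diluted solution."""
    return 100.0 / dilution_factor


def injection_volume_ul(body_weight_g, dose_mg_per_kg=50.0, working_mg_per_ml=5.0):
    """Injection volume in microlitres for one mouse."""
    dose_mg = dose_mg_per_kg * body_weight_g / 1000.0
    return dose_mg / working_mg_per_ml * 1000.0


print(working_concentration())  # 5.0 mg/ml, as stated in the text
print(dmso_percent())           # 5.0 percent DMSO, matching the vehicle control
volume = injection_volume_ul(20.0)  # hypothetical 20 g mouse: about 200 microlitres
```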

Patient sample analysis. Ten patients with AML (females, n = 4; males, n = 6) 
and two with CML blast phase (BP) (males, n = 2) were examined. Diagnoses were 
established according to the proposal of the French-American-British (FAB) 
cooperative study group and the classification of the World Health
Organization (WHO). Informed consent was obtained before bone marrow
puncture in each case. The study was approved by the Institutional Review 
Board (Ethics Committee) of the Medical University of Vienna. Mononuclear 
cells (MNC) were isolated using Ficoll and stored in liquid nitrogen until used. 
After thawing, the viability of cells ranged from 75% to 85% as assessed by trypan 
blue exclusion. RNA was isolated from bone marrow or peripheral blood MNC 
using the RNeasy MinElute Cleanup Kit (Qiagen). cDNA was synthesized using
RQ1 DNase buffer, RQ1 DNase and RQ1 DNase Stop Solution (Promega), 
Moloney murine leukaemia virus reverse transcriptase, random primers, First 
Strand buffer, dNTPs (100 mM), and RNasin (Invitrogen) according to the
manufacturer’s instructions. Details of the patient material used are deposited in
Supplementary Information; Patient information. 

MNC were cultured in 96-well plates (TPP) in RPMI-1640 medium (Lonza)
with 10% fetal calf serum in the absence or presence of JQ1 (50-5,000 nM) at 37 °C
for 48 h. AML MNC were incubated with JQ1 in the presence of 100 ng ml⁻¹
recombinant human (rh) G-CSF (Amgen), 100 ng ml⁻¹ rhSCF (Peprotech) and
100 ng ml⁻¹ rhIL-3 (Novartis Pharma AG). CML BP cells were incubated with
JQ1 without cytokines. After incubation, 0.5 µCi 3H-thymidine (Perkin Elmer)
was added for 16 h. Cells were then harvested on filter membranes in a Filtermate
196 harvester (Packard Bioscience). Filters were air-dried and the bound
radioactivity was measured in a Top-Count NXT β-counter (Packard Bioscience). All
experiments were performed in triplicate. Proliferation was calculated as a
percentage of medium control, and the inhibitory effects of JQ1 were expressed as
IC50 values.

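
The two calculations named here can be sketched in Python. Proliferation is expressed as a percentage of the medium control, and an IC50 is estimated by interpolating on a log concentration scale between the two doses that bracket 50%. The text does not say how IC50 values were derived, so the log-linear interpolation is an assumption made for this example, and all names and values are hypothetical.

```python
import math


def percent_of_control(counts_treated, counts_control):
    """Thymidine uptake of treated cells as a percentage of the medium control."""
    return 100.0 * counts_treated / counts_control


def ic50_by_interpolation(concentrations_nm, percent_values):
    """Estimate the IC50 by log-linear interpolation between bracketing doses.

    Concentrations must be sorted in ascending order. Returns None when the
    response never crosses 50% within the tested range.
    """
    points = list(zip(concentrations_nm, percent_values))
    for (c_low, p_low), (c_high, p_high) in zip(points, points[1:]):
        if p_low >= 50.0 >= p_high and p_low != p_high:
            fraction = (p_low - 50.0) / (p_low - p_high)
            log_ic50 = math.log10(c_low) + fraction * (math.log10(c_high) - math.log10(c_low))
            return 10.0 ** log_ic50
    return None


# Hypothetical counts and dose-response values.
print(percent_of_control(2500.0, 10000.0))  # 25.0
estimate = ic50_by_interpolation([100.0, 1000.0], [75.0, 25.0])  # roughly 316 nM
```
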
Statistical analyses. Results are expressed as means ± standard error of the mean.
If not stated otherwise statistical significance was calculated by two-tailed unpaired 
t-test on two experimental conditions with P < 0.05 considered statistically signifi- 
cant. Statistical significance levels are denoted as follows: ****P< 0.0001; 
***P < 0.001; **P < 0.01; *P < 0.05. No statistical methods were used to predeter- 
mine sample size. 
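
The summary statistics and the significance notation defined above translate directly into code. The Python sketch below computes the mean and standard error of the mean for a set of replicates and maps a P value to the asterisk notation used in the figures. Function names and example values are hypothetical.

```python
from statistics import mean, stdev


def mean_sem(values):
    """Mean and standard error of the mean (sample standard deviation / sqrt(n))."""
    return mean(values), stdev(values) / len(values) ** 0.5


def significance_stars(p_value):
    """Asterisk notation for the thresholds listed in the text."""
    if p_value < 0.0001:
        return "****"
    if p_value < 0.001:
        return "***"
    if p_value < 0.01:
        return "**"
    if p_value < 0.05:
        return "*"
    return ""  # not statistically significant


# Hypothetical triplicate measurement.
average, sem = mean_sem([2.0, 4.0, 6.0])
print(average)                     # 4.0
print(significance_stars(0.03))    # *
print(significance_stars(0.0005))  # ***
```
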

Extended Data Figure 1 | Multiplexed shRNAmir screening for chromatin-
associated dependencies in AML maintenance and BET resistance. a, Scatter 
plot illustrating the correlation of normalized reads per shRNA at all three 
time points (T0, T1 and T2; top to bottom) compared to the plasmid pool in all 
three independent replicates (left to right). The schematic to the right illustrates
the different sampling time points subjected to deep sequencing. Top hits 
enriched under JQ1 treatment and positive controls from the initial negative-
selection screen (T1) are marked with coloured dots according to the legend on 
the right. b, Pooled negative-selection screening depicting changes in 
representation of 2,917 shRNAs during 7 days of culture. shRNA fold depletion 
values were calculated by dividing the number of reads after 7 days of culture 
(T1) by the number of reads obtained from the plasmid pool, and are plotted as 
the mean of three replicates in ascending order. Completely depleted shRNAs
(0 reads at T1) obtained a fold depletion value of 1 × 10°. Positive control
shRNAs targeting essential genes are marked in red; negative control shRNAs 
are depicted in green. c, Scatter plot depicting all genes ranked by the sum of 
their average depletion score of all shRNAs across all three replicates. Top 
scoring hits were defined as genes for which at least two shRNAs showed an 
average depletion of eightfold after 7 days of shRNA expression and are marked 
in red (45 genes). d, IC50 determination for JQ1 in murine MLL-AF9;NrasG12D
AML cells (RN2). Obtained numbers of viable cells per ml were normalized to 
DMSO (n = 3, mean ± s.e.m.). e, Table showing the top ten enriched
shRNAs at T2. shRNAs targeting Suz12, Dnmt3a and Psip1 are strongly 
enriched in all three independent replicates. f, Relative mRNA abundance 
(RPKM) of shRNA target genes in JQ1-resistant leukaemia cells expressing the 
indicated shRNAs, plotted relative to leukaemia cells expressing Ren.713.

Extended Data Figure 2 | PRC2 suppression confers resistance to JQ1.
a, Competitive proliferation assays of MLL-AF9;NrasG12D leukaemia cells
transduced with pLMN constructs expressing the indicated shRNAs. Shown is
the fraction of GFP+/shRNA+ cells (relative to the initial ratio) under
treatment with JQ1 (50 nM) or DMSO over time (continuation from Fig. 1d).
JQ1 treatment was initiated at the indicated time points (red arrow).
b, Competitive proliferation assays evaluating one validated shRNA
targeting Eed as an additional core component of the PRC2 complex
(as in a). c, Competitive proliferation assays of Tet-on competent
MLL-AF9;NrasG12D leukaemia cells expressing the indicated shRNAs from a
Tet-inducible vector (pRT3GEN-miR30). Transduced cells were selected with
G418 and subsequently mixed with wild-type (wt) cells in a ratio of 95% to 5%.
shRNA expression was induced using dox treatment (1 µg ml⁻¹; from day
0), and the fraction of GFP+ cells was measured over time and plotted relative
to day 2. JQ1 treatment (50 nM) was initiated in one of the duplicate samples
at the indicated time point (red arrow). Once the percentage of viable cells
was below 10%, measurements were discontinued (indicated in the graph by
the discontinuation of the respective sample). The negative effect induced by
Suz12 suppression is reverted upon treatment with JQ1, which is not the
case when Myc is suppressed. d, Bar chart showing colony-forming cell
frequencies of MLL-AF9;NrasG12D leukaemia cells expressing Ren.713 or
Suz12.1676 shRNAs in the presence of DMSO or 200 nM JQ1; type 1,
myeloblasts; type 2, maturing myeloblasts; type 3, terminally differentiated 
myeloid cells (n = 3, mean + s.e.m.). e, Pie charts depicting the fraction of JQ1- 
response genes which are re-expressed to the indicated extent in mouse 
resistant AML cells expressing shRNAs targeting Dnmt3a and Psip1 
(continuation of Fig. 2b). JQ1-response genes (Fig. 2a) were grouped into four 
categories based on the divergence of their expression in resistant AML 
compared to AML expressing Ren.713 (not changed compared to expression 
after 24h JQ1, less than 1.5-fold; restoration relative to DMSO control: full 
restoration, less than 1.5-fold; enhanced, above 1.5-fold; partial restoration, 
restored but less than 1.5 fold) f, Heat map showing the distribution of 
H3K27me3, H3K36me3 and H3K4me3 ChIP-seq peaks (fold enrichment >5 
over input, FDR <1%) relative to Brd4 enhancer and promoter binding sites 
in MLL-AF9;Nras@!?” AML cells with and without JQ1 treatment. g, ChIP-seq 
occupancy profiles of Brd4 and H3K27me3 and H3K36me3 chromatin 
marks at enhancer regions downstream of Myc and upstream of Tifab following 
3 days of treatment with vehicle or JQ1 (25 nM). y axis reflects the number of 
normalized cumulative tag counts in each region. 


©2015 Macmillan Publishers Limited. All rights reserved 


[Extended Data Figure 3 image. Recoverable labels: H3K27ac (Suz12.1842 LT,
Ren.713); top 15 gained enhancers (associated genes, closest TSS); H3K27ac and
IgG ChIP (% input) at the Myc promoter, Pvt1 enhancer and Pvt1 promoter;
enrichment plots for KEGG_TGF_BETA_PATHWAY, ST_WNT_BETA_CATENIN_PATHWAY and
KEGG_WNT_SIGNALING_PATHWAY (RNAi PRC2 versus RNAi control, with NES and FDR
q-values); H3K27me3 tracks at Prkca, Fzd5 and Fzd2.]

Extended Data Figure 3 | Changes in the regulatory landscape of BET-
inhibitor-resistant mouse AML cells. a, Global H3K27 acetylation density of
Suz12.1842-expressing resistant MLL-AF9;NrasG12D leukaemia cells under
long-term (LT) treatment with 50 nM JQ1 (red bar), after 4 days of drug
withdrawal (orange bar) and in Ren.713 controls (blue bar; statistical
significance determined using Student's t-test). b, Left panel, sorted fold change
(FC) ratios of H3K27ac peaks in long-term JQ1-treated MLL-AF9;NrasG12D
leukaemia cells expressing Suz12.1842 compared to cells expressing Ren.713
control shRNA. Included were all peaks showing >10 reads per million in at
least one condition. Right panel, top 15 gained peaks and their associated genes
(defined using the closest transcription start site, TSS). The Myc proximal
enhancer in the first intron of Pvt1 is highlighted in red as one of the most
differentially enriched peaks (FC = 4.18). c, RT-PCR validation of presented
H3K27ac ChIP-seq at the indicated regions downstream of the Myc locus
(n = 3, mean ± s.e.m., statistical significance determined using Student's
t-test). d, Gene set enrichment analysis plots of three publicly available gene sets
associated with signalling pathways comparing expression changes in resistant
MLL-AF9;NrasG12D AML cells induced by suppression of PRC2 complex
members, compared to control cells expressing Ren.713 shRNA (n = 2) or
empty vector (continuation of Fig. 2e, f). e, Core signature genes of
KEGG-curated Wnt and TGF-β gene sets with increased expression in resistant
murine MLL-AF9;NrasG12D cells, compared to sensitive cells. Red coloured bars
indicate association with H3K27 methylation in JQ1-sensitive
MLL-AF9;NrasG12D AML cells. f, H3K27 methylation density at three exemplified
genes with high expression changes in JQ1-resistant murine AML.



[Extended Data Figure 4 image. Recoverable labels: flow cytometry histograms
for FSC, Mac-1 and Gr-1 (Suz12.1842 LT, Suz12.1676 LT, Ren.713); Ren.713 and
Suz12.1676; enrichment plots for LSC_Signature_Somervaille and
Macrophage_Development_IPA (shPRC2 versus shCtrl; reported values FDR
q-value = 0.957, NES = 0.7 and FDR q-value = 0.81, NES = 1.0).]

Extended Data Figure 4 | Resistant MLL-AF9;NrasG12D AML cells
generated through Suz12 suppression are not enriched for LSC-associated
surface markers or expression signatures. a, Immunophenotyping of
JQ1-resistant mouse AML cells expressing two independent Suz12 shRNAs
stably cultured in 50 nM JQ1 for more than 4 weeks (LT) compared to cells
expressing Ren.713 control shRNA. Data are representative of two independent
biological replicates. b, Percentage of c-Kit+ cells in CD45.2+ bone marrow
cells isolated from terminally diseased CD45.1+ mice following transplantation
with MLL-AF9;NrasG12D cells expressing Ren.713 or Suz12.1676 and in vivo
treatment with DMSO carrier or JQ1 (50 mg kg−1 per day) (n = 5, mean ±
s.e.m.). c, Gene set enrichment analysis evaluating changes in macrophage
and LSC gene signatures in resistant MLL-AF9;NrasG12D AML expressing
three PRC2 shRNAs compared to cells expressing Ren.713 shRNA (n = 2)
or empty vector (see also Fig. 2e, f).
Extended Data Figure 5 | Comparison of JQ1-response genes in sensitive
and resistant cancer cell lines. a, JQ1 sensitivity profiling in 246 human cancer
cell lines of different tissue contexts. Shown are GI50 values determined
using Alamar blue staining after 72 h. b, Principal component analysis of
steady-state transcriptomes (based on RPKM) and c, transcription changes
(based on fold change, FC) following 2 h of JQ1 treatment (200 nM) in
indicated sensitive and resistant cancer cell lines of different tissue context.
Steady-state profiles cluster based on tissue context, while neither baseline nor
dynamic expression analysis can accurately distinguish sensitive and resistant
contexts. d, MYC mRNA levels (RPKM) at different time points after JQ1
treatment (200 nM) in indicated cell lines, relative to levels in DMSO-treated
cells. Individual cell line pairs are grouped by tissue context and coloured
according to their sensitivities (green, sensitive; red, resistant). e, Number of
genes twofold up- or downregulated upon JQ1 treatment (200 nM) after 2 h
and 24 h (minimum expression >3 RPKM). f, Pairwise overlap of commonly
up- or downregulated genes after 2 h of JQ1 treatment (200 nM) relative to
DMSO control. Each cell colour corresponds to the relative number of
commonly up- or downregulated genes in the cell lines listed in the respective
row and column. The total number of genes regulated per cell line is
indicated in black next to each cell line. Only little overlap is observed between
cell lines of the same tissue context as well as between JQ1-sensitive or -resistant
cell lines.
Extended Data Figure 6 | shRNA-based analysis of BRD2/3/4-dependent
target genes and detailed analysis of effects on HEXIM1 and MYC
transcription in different cell lines. a, Determination of knockdown levels of a
set of BRD2, BRD3 and BRD4 shRNAs determined using an established
fluorescence-based shRNAmir reporter assay''. Knockdown levels were
quantified relative to Ren.713 control shRNA. b, Competitive proliferation
assays in MOLM-13 cells functionally evaluating the potency of BRD2, BRD3
and BRD4 shRNAs over time using a Tet-regulated vector (pRT3GEN) in
the presence of doxycycline. Only shRNAs targeting BRD4 induce a proliferative
disadvantage and lead to rapid depletion of GFP+ cells over time. Red labels
indicate most potent shRNAs based on results obtained from reporter and
competitive proliferation assay. c, Determination of BRD2, BRD3 and BRD4
mRNA levels in K-562 and MOLM-13 cells expressing the indicated shRNA.
d, Number of genes commonly up- or downregulated with a fold change
(FC) >3 in MOLM-13 or K-562 cells following expression of indicated
validated shRNAs or treatment with 200 nM JQ1 for 24 h. JQ1-induced
expression changes show the largest overlap with cells expressing a validated
BRD4 shRNA, suggesting that suppression of BRD4 is the key effector
mechanism of JQ1 in leukaemia. e, Venn diagrams showing the overlap of
genes commonly up- or downregulated following 2 h and 24 h of JQ1 either in
all contexts, or specifically in sensitive or resistant leukaemias. f, HEXIM1/
Hexim1 expression (RPKM) in all 18 analysed human cell lines and murine
MLL-AF9;NrasG12D AML cells after 2 h and 24 h treatment with 200 nM JQ1,
compared to DMSO control (statistical significance was determined using a
paired Student's t-test). g, Relative MYC mRNA levels determined by
qRT-PCR quantification in the indicated cell lines after incubation with 200 nM JQ1
measured at different treatment time points. Cell lines are grouped according to
their sensitivity and the respective IC50 is presented below. Resistant cell lines
rapidly restore MYC transcription (n = 3, mean ± s.e.m.).
Extended Data Figure 7 | Dynamic STARR-seq analysis of enhancer activity
in K-562 cells. a, Schematic representation of the STARR-seq cloning and
screening strategy in K-562 cells. BACs available for the extended MYC locus
(covering approximately 91% of a 3.1 Mb region at the MYC locus) and 25
genic control BACs were fragmented and cloned into a modified STARR-seq
vector containing a minimal MYC promoter. This library was then screened
for enhancer activity using STARR-seq in K-562 cells with or without
250 nM JQ1. The schematic shows the underlying principle of STARR-seq: a
bona fide enhancer can activate its own transcription from a minimal MYC
promoter. Messenger RNA corresponding to active enhancer elements will
therefore become more abundant among the cellular RNA compared to
inactive fragments. b, PVT1 mRNA levels (RPKM) at different time points after
JQ1 treatment (200 nM) in indicated cell lines, relative to levels in
DMSO-treated cells. PVT1 expression is generally reduced upon JQ1 treatment,
indicating no association with enhancer activation.
Extended Data Figure 8 | Analysis and functional validation of Wnt
signalling as a key driver of BET resistance. a, Protein levels of TCF4 and
IGF2BP1 in K-562 cells transduced with pRT3GEN expressing indicated
shRNAs after 7 days of doxycycline treatment, compared to Ren.713 and
wild-type (wt) control samples. b, Competitive proliferation assay of JQ1-sensitive
MOLM-13 cells expressing GFP-linked TCF4 cDNA or empty vector.
Plotted is the relative fraction of GFP+ cells 72 h after JQ1 treatment using the
indicated doses (n = 3, mean ± s.e.m., ***P < 0.001; **P < 0.01; *P < 0.05 as
determined by Student's t-test). Cells overexpressing TCF4 exhibit a
dose-dependent competitive advantage under JQ1 treatment. c, Protein levels of
TCF4 after overexpression of TCF4 cDNA subcloned into
pMSCV-PGK-NeoR-IRES-GFP in MOLM-13 after 4 weeks of G418 (0.5 mg ml−1) selection,
compared to MOLM-13 transduced with empty control vector and to TCF4
protein levels in K-562 cells. d, ChIP qRT-PCR analysis of TCF7L2 binding to
AXIN2, SP5, MYC promoter and the PVT1 enhancer element in K-562 cells at
indicated time points after treatment with 200 nM JQ1 (n = 2 biological
replicates, mean ± s.e.m.). TCF7L2 binding increases gradually over time at
promoters of Wnt target genes and the PVT1 enhancer at the MYC locus.
e, Protein levels of Apc in 3T3 murine fibroblast cells 7 days after infection with
the indicated shRNAs cloned into pLMP compared to Ren.713 and wild-type
control samples. f, Relative Axin2 mRNA expression levels, determined by
qRT-PCR normalized to B2m, after expression of the indicated shRNAs
targeting Apc. g, Competitive proliferation assays of MLL-AF9;NrasG12D
leukaemia cells transduced with pLMP constructs expressing shRNAs targeting
Apc, in combination with JQ1 (50 nM) or DMSO over time (continuation from
Fig. 4d). h, Top, bioluminescent imaging of mice transplanted with 1 x 10°
MLL-AF9;NrasG12D leukaemia cells expressing constitutively active Ctnnb1
(Ctnnb1.4x). Treatment with JQ1 (50 mg kg−1 per day) or DMSO carrier
started at day 1 after injection. Bottom, Kaplan-Meier survival curves of control
and JQ1-treated mice demonstrate decreased survival rates in mice treated with
JQ1 (n = 5). Statistical significance was calculated using the log-rank test.
i, Competitive proliferation assays of BCR/ABL p210;p53−/− and
MLL/ENL;NrasG12D leukaemia cells transduced with constitutively active Ctnnb1
(Ctnnb1.4x) or empty vector control in the absence or presence of JQ1.
Measurements started 4 days after transduction together with JQ1 treatment
(50 nM). Shown is the fraction of GFP+ cells (relative to initial) over time.
Extended Data Figure 9 | Wnt signatures are generally associated with BET
resistance and Wnt activation drives resistance of pancreatic cancer models.
The BETi_Resistance_Rathert signature was defined by combining the core
enriched genes from the two significant Wnt gene sets (KEGG_WNT_SIGNALING
and ST_WNT_SIGNALING) in murine resistant AML, filtered
for significant upregulation (DESeq padj <0.1). These were combined with the
Wnt-associated genes found differentially expressed in resistant human
leukaemia cell lines and primary patient samples, resulting in a total of 26 genes.
a, Left, microarray expression data of all 26 signature genes was curated from
the Cancer Cell Line Encyclopedia (CCLE)”* and normalized to the geometric
mean of each individual gene throughout all samples (relative expression). The
sum of the relative expression of all genes (resistance index) was plotted for all
CCLE cell lines showing a GI50 > 450 nM (resistant, 54 total) or a
GI50 < 150 nM (sensitive, 55 total) based on our sensitivity profiling (statistical
significance was determined using Student's t-test). Right, gene set enrichment
analysis plot comparing the expression of 26 signature genes associated with
JQ1 resistance across 54 resistant (GI50 > 450 nM) and 55 sensitive cell lines
(GI50 < 150 nM) available from CCLE. b, Left, as in a the resistance index was
calculated as the sum of relative expression values of all 26 signature genes,
which were based on RPKM extracted from an independent RNA-seq data
set (ref. 27). Plotted are all 49 resistant (GI50 > 450 nM) and all 50 sensitive
(GI50 < 150 nM) cell lines analysed in both sensitivity profiling and RNA-seq (ref. 27)
(statistical significance determined using Student's t-test). Right, gene set
enrichment analysis plot comparing the expression of 26 signature genes
associated with JQ1 resistance across 49 resistant (GI50 > 450 nM) and 50
sensitive cell lines (GI50 < 150 nM) available from ref. 27. c, Competitive
proliferation assays of murine pancreatic adenocarcinoma (KRPC2) cells
transduced with pLEPC constructs expressing potent validated shRNAs
targeting Apc, cultured in the presence of JQ1 (800 nM) or DMSO. Shown is the
relative number of mCherry+ cells over time, relative to initial. d, Cell viability
of murine KRPC2 was determined following 5 days of treatment with
pyrvinium and/or JQ1 as indicated (n = 3, mean ± s.e.m., statistical
significance determined using Student's t-test). The combination index (CI) for
drug combinations was calculated using the CompuSyn software and
percentage inhibition (fraction affected, Fa) resulting from combined action of
the two drugs versus effects of either drug alone. CI values <1.0 indicate
synergism of the two agents. e, As in d, cell viability was determined for human
ASPC-1 pancreatic cancer cells following 5 days of treatment with pyrvinium
and/or JQ1 as indicated (n = 3, mean ± s.e.m., statistical significance
determined using Student's t-test). The combination index for drug
combinations was obtained using percentage inhibition (fraction affected, Fa)
resulting from combined action of the two drugs versus effects of either drug
alone.
Extended Data Figure 10 | Expression analysis of Wnt-associated genes in
primary AML patient samples. a, Determination of JQ1 response profiles in
12 primary AML patient samples. Sensitivity was determined using
3H-thymidine uptake across different JQ1 concentrations. b, RT-PCR analysis of
mRNA levels of additional Wnt-associated genes in primary human
leukaemia samples relative to GAPDH (continuation of Fig. 4g). Patient groups
with low JQ1 IC50 (<200 nM, blue dots) were compared to patients with high
IC50 (>500 nM, red dots). Statistical significance was determined using a
Student's t-test. c, Definition of a JQ1 resistance index. Expression of HOXB4,
TCF4 and CCND2 in each primary AML patient sample was normalized
to the geometric mean of all samples. The relative expression values of all
three genes were then summed to give a resistance index, which was plotted
in comparison to the JQ1 IC50 of each sensitive (IC50 < 200 nM, blue dots)
and resistant (IC50 > 500 nM, red dots) AML patient sample. Two-tailed
Pearson correlation coefficient r and P value are shown.
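
The resistance index defined in panel c can be written out as a short Python sketch: each gene is divided by its geometric mean across all samples, and the resulting relative values are summed per sample. The function names and any input values are hypothetical, and the sketch assumes strictly positive expression values; it illustrates the arithmetic only and is not the authors' analysis code.

```python
from math import exp, log, sqrt


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return exp(sum(log(v) for v in values) / len(values))


def resistance_index(expression):
    """Per-sample resistance index.

    `expression` maps a gene name to its expression values, one per sample,
    with samples in the same order for every gene. Each value is divided by
    the geometric mean of that gene across samples, and the relative values
    are summed over genes.
    """
    n_samples = len(next(iter(expression.values())))
    index = [0.0] * n_samples
    for values in expression.values():
        gm = geometric_mean(values)
        for i, value in enumerate(values):
            index[i] += value / gm
    return index


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

Calling `resistance_index` with the three genes named in the legend as keys and then `pearson_r(index, ic50_values)` would reproduce the kind of comparison plotted in the panel; the P value is not computed here.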
Crystal structures of a double-barrelled fluoride ion channel


Randy B. Stockbridge, Ludmila Kolmakova-Partensky, Tania Shane, Akiko Koide, Shohei Koide,
Christopher Miller & Simon Newstead


To contend with hazards posed by environmental fluoride, microorganisms
export this anion through F−-specific ion channels of the
Fluc family’ ~“. Since the recent discovery of Fluc channels, numerous
idiosyncratic features of these proteins have been unearthed, including
strong selectivity for F− over Cl− and dual-topology dimeric
assembly”*. To understand the chemical basis for F− permeation
and how the antiparallel subunits convene to form a F−-selective
pore, here we solve the crystal structures of two bacterial Fluc homologues
in complex with three different monobody inhibitors, with
and without F− present, to a maximum resolution of 2.1 Å. The
structures reveal a surprising ‘double-barrelled’ channel architecture
in which two F− ion pathways span the membrane, and the
dual-topology arrangement includes a centrally coordinated cation,
most likely Na+. F− selectivity is proposed to arise from the very
narrow pores and an unusual anion coordination that exploits the
quadrupolar edges of conserved phenylalanine rings.

The fluoride anion, ubiquitous in the aqueous biosphere throughout
evolutionary time, is a xenobiotic inhibitor of essential
phosphoryl-transfer enzymes’. Unicellular organisms directly exposed to
environmental F− counteract the toxicity of the anion through the action of
F−-exporting membrane transport proteins that keep cytoplasmic F−
below inhibitory levels’**. Two recently discovered phylogenetically
unrelated families of F− exporters carry out this task: the CLCF F−/H+
antiporters, a strictly bacterial clade of the CLC superfamily of anion
transporters, and small-membrane proteins of the Fluc family (also
known as CrcB or FEX). Fluc genes are also found in plants, fungi,
primitive marine chordates, and sponges, but not in mammals. We
recently established*® that Fluc proteins are ion channels with two
unusual properties: an exceedingly high specificity (>10^4) for F− over
Cl−, and a dual-topology dimeric architecture, in which the two subunits
forming the active channel associate in antiparallel transmembrane
orientation. Dual-topology dimeric construction is known in small
multidrug transporters”, but has not been previously observed in ion
channels. These inferences regarding function and structure of Flucs
provoke fundamental questions about their mechanisms, such as (1)
how the channel achieves such extreme selectivity for F−, arguably the
highest selectivity of any ion channel to our knowledge; (2) whether the
protein contains a single pore on the subunit interface or two pores, one
in each subunit, as in CLC Cl− channels’’; and (3) whether the channel
homodimer is symmetrical, or the two subunits adopt different
conformations, as in the multidrug resistance transporter EmrE™"’.

In previous work, the antiparallel transmembrane topology of Fluc
channels was intimated by the distribution of positively charged
residues in Fluc sequences’’, was strongly suggested by crosslinking
and functional reconstitution’, and was established definitively by
two-sided block of single channels by ‘monobodies’, engineered
proteins selected as high-affinity binding partners from combinatorial
libraries®. We used these monobodies in crystallization trials to form
complexes with a Fluc channel from Bordetella pertussis, denoted Bpe’.
Crystals could not be grown unless monobodies were present;
however, diffraction to 3.6 Å Bragg spacing was obtained with the
monobody Mb(Bpe-S7) (ref. 6), hereafter denoted S7. A structure
was solved, with phases initially determined by single-wavelength
anomalous diffraction (SAD) of Bpe labelled with Hg at a unique
cysteine residue, and improved with Hg-labelled
selenomethionine-substituted samples (Extended Data Table 1). A view of the crystal
lattice highlights the importance of the monobodies as crystallization
chaperones, which exclusively mediate crystal contacts in all structures
presented here (Extended Data Fig. 1).

Figure 1 | Bpe-S7 structure. a, Bpe homodimer, viewed parallel (centre
structure) and perpendicular (right structure) to the membrane.
Colouring is as indicated in the transmembrane
helix schematic on the left. b, Through-membrane
views, at the same orientation as in a, indicating
the TM3 break (red) and the aqueous volumes
within the vestibules (blue). c, Bpe with S7
monobodies bound. Variable sequences of the
monobodies (cyan) are shown with ribbon or mesh
representation. A single-channel Bpe recording
showing S7 (500 nM) binding and dissociation is
shown below. Zero-current level is indicated by the
dashed line.

Although devoid of F− ions and at low resolution, the Bpe-S7
structure reveals the overall architecture of the channel (Fig. 1a).
Bpe is an antiparallel homodimer in which each subunit consists of
four transmembrane helices (denoted TM), with an overall fold that is
novel among membrane proteins. The 1,700 Å^2 dimer interface is
almost completely membrane-embedded. The third, highly conserved
helix (TGXXXGLTTFSTFXXXE, in which X denotes any amino acid)
is broken into two halves, TM3a and TM3b, by a six-residue
non-helical segment located roughly at the centre of the membrane.
These two segments, one from each subunit, cross each other near
the two-fold axis of the channel running parallel to the membrane
plane. The channel is hourglass-shaped, with wide vestibules
symmetrically opening to two aqueous solutions (Fig. 1b) separated by a solid
plug of protein 10-15 Å thick. A conspicuous universally conserved
TM1 arginine residue (R23) protrudes into each vestibule, suggestive
of an electrostatic lure for F−. No aqueous pore connecting the
vestibules is visible in this low-resolution structure.
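
The conserved TM3 motif quoted above, in which X stands for any amino acid, maps directly onto a regular expression. The following sketch shows one way to scan a protein sequence for it; the function names and the test sequence in the usage note are hypothetical and serve only to illustrate the pattern syntax.

```python
import re

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def motif_to_regex(motif):
    """Compile a motif string, treating 'X' as 'any standard amino acid'."""
    pattern = "".join(
        "[" + AMINO_ACIDS + "]" if residue == "X" else residue
        for residue in motif
    )
    return re.compile(pattern)


def find_motif(sequence, motif="TGXXXGLTTFSTFXXXE"):
    """Return the 0-based start positions of non-overlapping motif matches."""
    return [match.start() for match in motif_to_regex(motif).finditer(sequence)]
```

For an invented sequence such as `"MAAATGAVLGLTTFSTFLIAEKK"`, `find_motif` reports a single match beginning at the first T of the motif.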

The channel is capped on both ends by the S7 monobody (Fig. 1c).
This monobody was selected from a library designed to target convex
protein surfaces’, and indeed its interaction surface, consisting largely
of the residues diversified in the library, wraps around a protrusion
formed from the TM1-TM2 and TM3b-TM4 connecting loops of the
channel. An eight-residue loop on the monobody plunges deeply into
each vestibule, contacting the channel mainly via side chains.
Channel-monobody interactions are mostly hydrophobic, aromatic, and hydrogen
bonded, the paucity of salt-bridges rationalizing the rather weak ionic
strength dependence of monobody binding’. Most of the
aqueous-exposed surface of the channel is covered by monobody, consistent with
S7 block of Bpe seen in single-channel recordings (Fig. 1c).

It is tempting to imagine that a central pore connects the two
vestibules in an unseen ‘open’ conformation. But the low resolution of the
structure and the absence of F− preclude identification of the
ion-permeation pathway. We therefore attempted in meso crystallization
in the hope of identifying bound F− ions. Crystals diffracting to 2.1 Å
in the presence of 20 mM F− were obtained with a different
monobody, L2, which is also a blocker (Extended Data Fig. 2). Structures
solved by molecular replacement (Extended Data Table 1) again show
the channel with a monobody on each end (Extended Data Figs 1
and 2). The backbone conformation of the channel is identical to that
in the lower-resolution structure (Cα root mean squared deviation
(r.m.s.d.), 0.4 Å), and L2, although binding in a different orientation
than S7, also extends a long loop of 8-10 Å into the vestibule, occluding
much of the channel’s water-exposed surface.
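
The Cα r.m.s.d. quoted above is a simple quantity once two structures have been superposed. The sketch below assumes that the two coordinate lists are already optimally superposed and paired residue by residue (the superposition step itself is not shown); the function name is hypothetical, and this is not the software used for the structures reported here.

```python
from math import sqrt


def ca_rmsd(coords_a, coords_b):
    """Root mean squared deviation between paired (x, y, z) coordinates.

    Assumes the two structures are already superposed and that the i-th
    entry of each list refers to the same residue.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(squared / len(coords_a))
```

If every Cα atom in one structure were displaced by 0.4 Å along a single axis relative to the other, the function would return 0.4, the value reported for the two Bpe structures.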

This higher-resolution structure reveals five intriguing electron
densities (Fig. 2a and Extended Data Fig. 3). First, a prominent density
resides in the centre of the plug separating the vestibules, precisely on
the homodimer’s two-fold non-crystallographic axis (Fig. 2b). We
identify this as a Na+ ion on the basis of its coordination by four
backbone carbonyl groups from residues in each subunit associated
with the conserved TM3 break (G77 and T80). This coordination is
inconsistent with a F− ion, a water, a divalent metal, or a K+ ion!™”.
Although coordination by only four oxygen ligands is uncommon for
Na+, it is nevertheless seen in ~15% of Na+-binding sites in the
protein database'®. This deeply buried cation could not exchange with
aqueous solution if the plug remained intact during functional activity,
and indeed, Bpe channels with familiar behaviour are readily recorded
in solutions with Na+ completely substituted by N-methyl glucamine
(Extended Data Fig. 4). We propose that the ion is an important
structural component incorporated irreversibly upon dimer assembly.

A second notable detail in the Bpe-L2 structure is a set of four
electron densities located in crevices between TM2, TM3b, and TM4
near the periphery of the channel, distant from the vestibules and the
central plug (Fig. 2a, c). We provisionally identify these as F− ions,
labelled F1 and F2 in non-crystallographic-symmetry-related pairs,
according to their distinct chemical environments. The putative
liganding atoms embracing these densities are consistent with a halide
coordination shell. In particular, the surround is composed of
electropositive side chains, which would engage the strong H-bond-accepting
tendency of the F− ion. Prominent among these are a strongly
conserved asparagine (N43) in TM2, and two conserved serines (S108 and
S112) in TM4. In addition, two pairs of conserved phenylalanine rings
(F82 and F85) near the TM3 break approach these densities in a
side-on orientation that presents the electropositive carbons of the
quadrupolar ring to the F− ion (Fig. 2d). These four aromatic rings appear
to be mutually stabilized in a notable ‘box’ assembly. Edge-on aromatic
liganding of anions is rare but not unprecedented in proteins’’, and F−
coordination by aromatic edges appears in many small-molecule
structures”. This type of coordination is reminiscent of the
phenylalanine rings of a proposed Cl−-binding site in the bestrophin
channel’'. With a deficit of H-bond acceptors, the coordination shells
observed here would be chemically inimical to ordered waters, which
cannot be distinguished from F− based on X-ray scattering alone.

The two F− ions of each pair lie in a vertical line tilted slightly off
normal to the membrane plane, possibly marking a narrow
permeation pathway. If these densities do indeed represent F−, then their
positions lead to a surprising conclusion: the Bpe channel contains
two pores running in antiparallel orientation along opposite sides of
the dimer, rather than a single central pore connecting the vestibules
through the plug. Two-pore behaviour is not apparent in
single-channel recordings as both are nearly always open*. The structures here
would represent the monobody-blocked state similar to that observed
electrophysiologically*”’.


Figure 2 | Bpe-L2 structure. a, Bpe homodimer
with mFo − DFc electron density map contoured at
4σ (green), viewed from solution (left) and the
membrane (centre and right). Aqueous volumes of
the vestibules are shown (grey). Dashed boxes
in right- and left-most structures indicate
zoomed regions in b and c, respectively. b, Na+
coordination sphere, indicated by blue-dashed
lines, viewed from solution. TM3 is represented as a
cartoon. c, The same view at 90°, showing Na+, F−
ions, N43, and the phenylalanine box (F82 and
F85). Cartoons indicate TM2 and TM3. d, F−
coordination shells, indicated by blue dashed lines,
with F82, S112 (at F1), N43, S108, and F85 (at F2)
shown as sticks. Stereo images are shown in
Extended Data Fig. 3.
Our reading of the structure as a double-barrelled channel depends
crucially on identifying these four densities as F⁻ ions; however, we do
not consider the above evidence sufficient to accept such an unusual
picture as firmly established. Accordingly, two additional experimental
approaches were pursued. First, a structure was determined
for a different Fluc homologue, Ec2, complexed with a monobody S9
(Extended Data Fig. 5). This homologue of only 33% identity shows
similar electrophysiological behaviour to Bpe, and is blocked by S9
at nanomolar concentrations. Ec2-S9 crystals diffracting to 2.6 Å
were grown from detergent in the presence of F⁻, and to avoid model
bias the structure was solved using SAD phasing with
selenomethionine-labelled protein (Extended Data Table 1). The Ec2 and Bpe folds
are identical (Cα r.m.s.d., 0.6 Å), with the inferred Na⁺ density
appearing in equivalent locations. Two strong difference densities lie at
precisely the same locations as the F1 ions in Bpe, coordinated identically
(Fig. 3a). The appearance of these densities, in a separate homologue
under very different crystallization conditions, strengthens our
hypothesis that the four densities in Bpe represent F⁻ ions. Additional
densities also appear in Ec2 in the general vicinity of the F2 ions in
Bpe, but at this lower resolution and without supporting experimental
evidence, these densities in Ec2 cannot be unambiguously assigned.

The chemical nature of the crevices housing the densities makes
sense for narrow diffusion pathways that are welcoming to F⁻ ions
in both homologues (Fig. 3b). In particular, the crevice-facing surface
of TM4 is lined with H-donating side chains, as manifested by every
fourth residue in the sequences of Bpe (Y104, S108, S112, and T116) or
Ec2 (S102, H106, S110, and T114). These particular residues are only
modestly conserved among Fluc channels, but H-bond donors
consistently appear here throughout the family. These could plausibly
contribute to a polar track, along which largely dehydrated F⁻ ions
move across the membrane. Because these pathways are extremely
narrow, protein dynamics may be necessary to allow F⁻ permeation,
and the monobodies might force a conformation in which the two
pores are less ‘open’ than in the fully conducting state.
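The ‘every fourth residue’ pattern can be checked with simple helical-wheel arithmetic. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° per residue), a textbook approximation rather than a value measured from the refined structures; the function and variable names are illustrative only.

```python
# Helical-wheel sketch: angular position of the TM4 polar-track residues.
# Assumption: ideal alpha-helix, 3.6 residues per turn, i.e. 100 degrees per residue.

DEGREES_PER_RESIDUE = 100.0  # 360 degrees / 3.6 residues per turn


def wheel_angles(positions):
    """Return each residue's angle (degrees, 0-360) relative to the first residue."""
    first = positions[0]
    return [round(((p - first) * DEGREES_PER_RESIDUE) % 360.0, 1) for p in positions]


# Residue numbers quoted in the text for the TM4 polar track.
bpe_track = [104, 108, 112, 116]  # Y104, S108, S112, T116
ec2_track = [102, 106, 110, 114]  # S102, H106, S110, T114

bpe_angles = wheel_angles(bpe_track)
ec2_angles = wheel_angles(ec2_track)

# Angular spread of the four side chains around the helix axis.
bpe_spread = max(bpe_angles) - min(bpe_angles)

if __name__ == "__main__":
    print("Bpe:", bpe_angles)
    print("Ec2:", ec2_angles)
    print("spread (degrees):", bpe_spread)
```

Under this idealization, residues spaced four apart advance by 40° each, so the four side chains sit within a 120° arc on one face of the helix, which is what would be expected of residues lining a single crevice.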
Figure 3 | Identification of F⁻ ions. a, Ec2-S9 F⁻-binding region, analogous
to Fig. 2c, with mFo − DFc map contoured at 4.8σ and ion positions shown. The
phenylalanine box and N41 are shown as sticks. Weak densities in F2 regions
could not be unambiguously assigned. b, Overlay of TM2 and TM4 from Bpe-L2
(cyan) and Ec2-S9 (yellow) structures, showing F⁻ ions and polar track.
c, d, Functional consequences of mutating F⁻-coordinating residues. F⁻ efflux
from Bpe-reconstituted proteoliposomes, initiated with valinomycin and
monitored with a F⁻ electrode, reported relative to total F⁻ trapped in
liposomes. WT, wild type.
We next examined the functional consequences of mutating each of
the three conserved Bpe residues coordinating the F⁻ densities: F82,
F85, and N43. We did not observe single-channel activity in electrical
recordings, but more sensitive ‘anion-dump’ experiments reveal
notable changes in F⁻ permeation (Fig. 3c). In these experiments,
Bpe-reconstituted liposomes loaded with KF are suspended in
low-F⁻ solutions, and the rate of passive F⁻ efflux is followed
electrochemically. To eliminate the aromatic quadrupole, the conserved
phenylalanine residues were mutated individually to isoleucine. For
F85I and F82I, efflux of F⁻ is two and three orders of magnitude
slower, respectively, than for wild type (~3 × 10⁵ s⁻¹; ref. 15)
(Extended Data Table 2). These mutations preserve the integrity of
the channel, as F⁻ efflux is 50–80% blocked by 6 µM of monobody
(Extended Data Fig. 6). Strong selectivity against Cl⁻ is observed in
parallel experiments with this halide.

The conserved asparagine was substituted to alter or remove
H-bonding capability (N43S and N43A), or to place an isosteric
carboxylate at this position (N43D). The first two mutants were
biochemically intractable, but N43D produced stable protein. Under
standard conditions at pH 7, N43D supports robust F⁻-selective efflux
(Fig. 3d and Extended Data Table 2). We had envisioned that an
anionic carboxylate at this position would prevent F⁻ entry into the
channel. It is possible, however, that the pKa of this group is perturbed
upwards by its local environment, so that at neutral pH conditions the
carboxyl acts as a protonated surrogate for the N43 amide. The N43D
mutant was therefore examined at several pH values. In stark contrast
to the pH-independent activity of wild-type Bpe and the F85I mutant
(Extended Data Fig. 5), F⁻ efflux in N43D falls with increasing pH
and is extinguished at pH 9 (Fig. 3d), verifying a key role of the
H-bond-donating N43 side chain in F⁻ conduction.
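The pH argument can be made concrete with the Henderson–Hasselbalch relation. The sketch below is illustrative only: the text does not report a pKa for the N43D carboxyl, so `ASSUMED_PKA` is a hypothetical placeholder chosen merely to show how the protonated, H-bond-donating fraction of such a group would fall across the pH range tested.

```python
# Fraction of a carboxyl group that is protonated (neutral, H-bond donating)
# at a given pH, from the Henderson-Hasselbalch relation.
# ASSUMED_PKA is hypothetical; no pKa for N43D is reported in the text.

ASSUMED_PKA = 7.5


def protonated_fraction(ph, pka=ASSUMED_PKA):
    """Return [HA] / ([HA] + [A-]) for a monoprotic acid."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    for ph in (6.5, 7.0, 9.0):  # pH values at which N43D efflux was measured
        print(f"pH {ph}: protonated fraction = {protonated_fraction(ph):.3f}")
```

With an upward-shifted pKa of this kind, most of the carboxyl would be protonated at pH 6.5 and very little at pH 9, which is the qualitative trend the efflux data follow. The actual pKa would have to be determined experimentally.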

These mutagenic manipulations of F⁻ permeation add mechanistic
evidence for assigning F⁻ ions to the densities in question. This
inference points to the conclusion that Flucs are double-barrelled F⁻
channels, with the observed F⁻ ions marking the ion-selective pathways.
The two pathways are not segregated to each subunit as in CLC
channels; instead, each pore comprises side chains from TM2,
TM3b, and TM4 of one subunit plus the TM3-break phenylalanine
from the opposing subunit. Although unexpected, this idea does not
clash with any electrophysiological experiments, and double-pore
assembly was cited previously as a possible, albeit unlikely, architecture
consistent with the functional behaviour of the channel.

Two-pore assembly neatly accounts for evolutionary drift in
eukaryotic Fluc channels, all of which consist of an inverted repeat of
two homologous Fluc domains fused into a single polypeptide.
Alignments (Extended Data Fig. 7) show that ‘pore 2’, where most
residues arise from the carboxy-terminal domain, retains the strict
sequence conservation typical of the homodimeric bacterial Flucs,
whereas the equivalent residues of ‘pore 1’, mostly in the amino-terminal
domain, are far less conserved, notably along the TM4 polar track. This
pattern is further confirmed for the two conserved phenylalanines that,
from the same domain, contribute to alternate pores. Thus, in pore 2, the
strongly conserved equivalents of F82 project from the N-terminal
domain, and those of F85 from the C-terminal domain, in contrast to
their poorly conserved residue-counterparts contributing to pore 1.
These sequence-based considerations suggest that pore 2 alone fulfils
the F⁻-export function in eukaryotic Flucs. Recent experiments
reinforce this idea by showing that mutations of several C-terminal domain
residues in pore 2 produce F⁻ hypersensitivity in yeast, whereas
mutations of the equivalent residues of the N-terminal domain, pore 1, are
relatively harmless. These features chronicle an evolutionary lineage of
gene duplication, fusion, and finally functional degradation of a
redundant pore by genetic drift in the eukaryotic homologues.

Other inferences emerging from these structures will require
further testing to confirm or refute. First, a buried Na⁺ ion occupying a
unique position on the two-fold axis invites us to view this cation as
an intrinsic structural element stabilizing the dimer interface. Second, all
four F⁻ ions observed in Bpe probably occupy the channel
simultaneously, given their high occupancies (>80%) in the 2.1 Å structure,
with B-factors matched to those in their coordination shells. The channel
might therefore display multi-ion conduction phenomena akin to those
long known in K⁺ channels. Third, the strong F⁻ selectivity of the
channel may arise from the narrow bore of the permeation pathway,
which would exclude Cl⁻ ions while permitting the smaller F⁻ ions to
enter. However, it is unclear why F⁻ would enter this confined space,
and how the protein compensates for the high energy of dehydrating F⁻.
We note that many of the coordinating groups are H-bond donors, able
to satisfy the H-bonding proclivity of the F⁻ ion. While the unusual
edge-on coordination by conserved phenylalanine rings is chemically
intriguing, the energetic contributions of these interactions have not
been established; nevertheless, in light of the conservation of these
residues, we speculate that this short-range quadrupolar interaction
contributes to F⁻ recognition and permeation in an essential way. The pore,
although narrow, is lined with H-bonding residues and so could provide
a polar conduit for transport of the ion across the membrane span.

A final point concerns the mechanism of electrodiffusive F⁻ transport
through these oddly fashioned pores. The crucial role of N43 in
permeation in Bpe, and the confined crevice in which it resides, lead us to
conjecture that F⁻ moves along the pore concomitant with a rotameric
switch of this side chain, such that the amide nitrogen remains H-bonded
as the anion moves along the pore (Fig. 4). Thus, the conduction
mechanism we propose here would be subtly distinct from classic diffusion
through a fixed, water-filled channel. Instead, it would incorporate a
central feature of membrane transporters: substrate transport coupled
to concerted movement of the protein. An asparagine side-chain rotation
could easily occur on the conduction timescale of microseconds, but
formally this picture is a nuanced mix of electrodiffusion and
configurational change, and so can be termed a ‘channsporter’ mechanism.


Online Content Methods, along with any additional Extended Data display items 
and Source Data, are available in the online version of the paper; references unique 
to these sections appear only in the online paper. 
METHODS


No statistical methods were used to predetermine sample size. 

Preparation of crystals. Expression, purification, and reconstitution of Fluc
channels were performed as previously described in detail. In the final purification
step, Fluc protein was collected from a S200 size-exclusion column equilibrated in
100 mM NaF (or NaCl for zero-F⁻ preparations), 10 mM HEPES, pH 7.0, and
5 mM n-decyl-β-D-maltoside (DDM). Bpe constructs carried two functionally
neutral mutations to enhance expression (R29K and E94S) or for Hg labelling
(R29K and E94C). Hg labelling was achieved by incubating Bpe with a threefold
molar excess of Hg(II) acetate for 30 min between the co-affinity and size-exclusion
columns. Ec2 constructs bore a single functionally neutral expression-enhancing
mutation (R25K), and, for selenomethionine incorporation, an additional
methionine was introduced (A51M) to enhance phasing power. The C-terminal His₆ tag
was removed from Bpe by treatment with lysine endoproteinase C (Roche), but
was left on Ec2. Fluc protein was typically reconstituted into liposomes at low
density (0.1–0.2 µg of protein per mg of lipid). For single-channel recording,
liposomes were fused into planar lipid bilayers in symmetrical solutions of
300 mM NaF, 15 mM MOPS, pH 7.0, and channels were recorded at 200 mV
holding voltage. Monobodies were expressed in Escherichia coli and purified
as previously described. N-terminal His₆ tags were removed while bound to talon
beads by 16-h treatment with tobacco etch virus (TEV) protease also carrying a His
tag; monobodies with cleaved His tags were eluted from the affinity column with
150 mM NaCl and 40 mM Tris-HCl, pH 7.5. For the final purification step, the
preparation was passed over a S75 size-exclusion column in 100 mM NaF (or
NaCl) and 10 mM HEPES, pH 7. Monobodies were used immediately for
crystallization or stored in frozen aliquots for channel-blocking experiments. For
crystallization from detergent micelles, Fluc protein in solution containing 5 mM
DDM was concentrated to 10 mg ml⁻¹, a step that concentrates the detergent
5–10-fold. Monobody solution (10 mg ml⁻¹) was supplemented with 4 mM DDM
immediately before mixing with channel in a 1.2:1 molar ratio. This protein
solution was then mixed with an equal volume of crystallization solutions
(0.5 µl for sitting drops in 96-well plates or 1 µl for hanging drops in 24-well
plates). Bpe-S7 crystals grew in 3–5 days in crystallization solutions of 36–41%
(w/v) polyethylene glycol monomethyl ether 550, 0.2 M MgCl₂ or CaCl₂ and
0.1 M Tris, pH 8.5–8.9. Ec2-S9 crystals grew in 10–14 days in crystallization solutions
containing 28–32% (w/v) polyethylene glycol monomethyl ether 550, 0.05 M
LiNO₃, and 0.1 M N-(2-acetamido)iminodiacetic acid, pH 6.0–6.7. Crystals were
frozen in liquid nitrogen for data collection. For lipidic cubic phase crystallization,
Fluc protein concentrated to 10 mg ml⁻¹ as above was dialysed overnight to reduce
the DDM concentration to 10 mM. This was then mixed with monobody solution
(10 mg ml⁻¹, with 4 mM DDM) in a 1:1.2 molar ratio. The protein-laden
mesophase was prepared by homogenizing 9.9 monoacylglycerol (monoolein) lipid
with protein solution (10 mg ml⁻¹) at a weight ratio of 1:1.5 (protein:lipid) using
a coupled syringe mixing device at 20 °C (ref. 27). Crystallization trials were
carried out at 19 °C in 96-well glass sandwich plates with 50 nl mesophase and
0.8 µl precipitant solution using an in meso robot. Crystallization solutions
consisted of 22–26% (v/v) polyethylene glycol dimethyl ether 550, and 0.1 M
Na-citrate, pH 5.5, with or without 10 mM NaF. Surfboard-shaped crystals grew
to a maximum size of 100 × 50 × 5 µm in 5–10 days. Wells were opened using a
tungsten-carbide glasscutter, and the crystals were collected using 50–100 µm
micromounts (MiTeGen). Crystals were snap-cooled directly in liquid nitrogen
before data collection on the Diamond Light Source beamlines I24 or I04.
Anion efflux from liposomes. Efflux of F⁻ or Cl⁻ out of liposomes was followed
electrochemically as described. Liposomes (10 mg ml⁻¹ lipid, 0.2–1 µg protein
per mg of lipid) loaded with 300 mM KF or KCl solutions were freeze-thawed for 3
cycles and then extruded 21 times through a 400-nm filter. Immediately before the
assay, a 100 µl sample was centrifuged through a 1.5-ml Sephadex column
equilibrated with flux buffer (300 mM K-isethionate, 1 mM KF or KCl, and 25 mM
HEPES, pH 7) and was diluted 20-fold into a stirred chamber containing 3.8 ml
flux buffer. Halide concentration in the suspension was continuously monitored
with a F⁻ or Cl⁻ electrode amplified through a pH meter and digitized at 5 Hz
sampling frequency. Efflux was initiated by adding 1 µM valinomycin, and after
several minutes 30 mM octylglucoside was added to obtain the 100% efflux level.
Efflux rates were calculated after calibration with 25 µM additions of NaF or NaCl.
For experiments with the Bpe N43D mutant, the flux buffer contained an
additional 100 mM Na-isethionate and 25 mM CHES
(N-cyclohexyl-2-aminoethanesulfonic acid) buffer. Single-channel block by monobodies was recorded in planar
phospholipid bilayers exactly as described.
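As a rough illustration of this kind of analysis, the sketch below normalizes an electrode trace against the detergent-released (100%) level and estimates the initial efflux rate from a least-squares line through the first few samples after valinomycin addition. The function names, the fitting window and the synthetic trace are assumptions made for illustration; they are not taken from the authors' analysis.

```python
# Sketch of an anion-efflux analysis. All names and numbers are illustrative.


def normalize_trace(conc, baseline, total):
    """Convert halide readings to the fraction of trapped ion released.

    baseline: reading before valinomycin; total: reading after detergent (100%).
    """
    span = total - baseline
    if span <= 0:
        raise ValueError("total must exceed baseline")
    return [(c - baseline) / span for c in conc]


def initial_rate(times, fractions, n_points=5):
    """Least-squares slope (fraction per second) over the first n_points samples."""
    t = times[:n_points]
    f = fractions[:n_points]
    n = len(t)
    if n < 2:
        raise ValueError("need at least two points")
    t_mean = sum(t) / n
    f_mean = sum(f) / n
    num = sum((ti - t_mean) * (fi - f_mean) for ti, fi in zip(t, f))
    den = sum((ti - t_mean) ** 2 for ti in t)
    return num / den


if __name__ == "__main__":
    # Synthetic trace sampled at 5 Hz (0.2 s steps), matching the stated sampling rate.
    times = [i * 0.2 for i in range(6)]
    conc = [1.00, 1.02, 1.04, 1.06, 1.08, 1.10]  # hypothetical readings, mM
    frac = normalize_trace(conc, baseline=1.00, total=2.00)
    print(initial_rate(times, frac))
```

Converting such a fractional rate into a per-channel turnover would additionally require the trapped ion content and the number of channels per liposome, neither of which is modelled here.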

Structure determination. Diffraction data for Bpe-S7 were processed by the Xia2
pipeline to XDS and scaled using aimless. The space group was determined
to be P2₁2₁2 with two Bpe dimers and four S7 monobodies in the asymmetric
unit (Extended Data Fig. 1). A phasing strategy was devised that used
pre-derivatization of Bpe mutated with a single cysteine residue (E94C) with Hg(II)
acetate before crystallization (see above). None of the native crystals were
isomorphous with the Hg-derivatized crystals (Riso > 40%), despite having similar
cell dimensions. Indeed, Hg-derivatized crystals were observed to diffract X-rays
to slightly higher resolution than native crystals, therefore effort was directed at
these samples for phasing and refinement. The four Hg sites were located using the
SAD method as implemented in SHELX with the positions further refined and
initial phases calculated using SHARP with solvent flattening in SOLOMON.
To improve the phases, a second and third data set were also collected at the Se
edge using both Hg- and seleno-L-methionine-derivatized protein and another
Hg-derivatized data set, respectively (Extended Data Table 1). All 16 Se plus 4 Hg
sites were located using SHELX and this data set was combined with the initial
3.6 Å Hg-derivatized data. Phases were substantially improved using SIRAS
combining the three data sets in SHARP. We did not observe higher resolution
diffraction in the native crystals, which typically gave diffraction between 3.6–3.8 Å.
Our highest resolution data set with optimal scaling statistics was one of the
Hg-derivatized crystals; we therefore used this data set for subsequent refinement of
the model built into the experimental electron density maps calculated from
SHARP (see below). For the Bpe-L2 crystals, data were similarly processed and
scaled in space group P1. Phases were calculated using molecular replacement as
implemented in Phaser, using the experimentally determined Bpe model and a
homology model of the L2 monobody based on a previously determined structure
of a loop-library monobody (PDB code: 3RZW). The unambiguous solution
showed two Bpe homodimers and four L2 monobodies. The electron density maps
clearly showed major differences in the selected variable regions of the monobody.
For the Ec2-S9 crystals, the data were processed as above with space group P4₁.
Phases were calculated using Se-SAD with eight Se sites, and processed as above.
The experimental electron density maps were of high quality following phase
extension to the highest resolution shell of 2.58 Å. Data were collected at
Advanced Light Source beamlines 8.2.1 and 8.2.2, and Diamond Light Source
beamlines I24 and I04.

Model building and refinement. For the Bpe-S7 complex structure, a model for
the channel was built into the experimental electron density maps calculated
from SHARP using O with σA-weighted 2Fo − Fc and mFo − DFc electron
density maps. The S7 monobodies were initially built using a homology model
based on a previously determined structure of a side-library monobody (PDB
code: 4JEG). These models were placed into the experimental electron density
maps using MolRep. The partial models were further cycled back into phase
calculation in SHARP to improve the initial solvent envelope used for the
solvent flipping procedure. The amino acid side chains were then built using
the Se and Hg sites to determine the correct register. Refinement of the Bpe-S7
model was carried out in Refmac5 (refs 37, 38) against the highest resolution
data set for these crystals, 3.6 Å, which came from one of the Hg-derivatized
crystals used for phasing (Extended Data Table 1). No previous phase
information was used during the refinement; however, refinement was improved
following anisotropic truncation of the structure factors. To avoid biasing the
model, non-crystallographic symmetry was not used except at the final round
of refinement to improve model geometry. Model validation was carried out
using the MolProbity server. The Ec2-S9 model was built directly into the
experimental maps, using Se sites to ensure the correct register, and then
monobodies were placed by molecular replacement using Phaser with a
homology model based on S7. The Bpe-L2 model was built into the electron density
maps calculated from Phaser following iterative rounds of structure refinement
in PHENIX and Refmac5. The structural model was revised in real space with
Coot.
Extended Data Figure 1 | Crystal lattices for the Bpe-S7, Bpe-L2 and Ec2-S9 crystal structures. The asymmetric unit is shown in green and red (channel and
monobody, respectively), and symmetry mates are shown in black and blue.
Extended Data Figure 2 | Bpe-L2 complex. Left, cartoon schematic of Bpe
crystal structure, coloured as in Fig. 1b. The variable regions of monobody
L2 are coloured cyan. Mesh-rendering is shown for the lower monobody. Right,
single-channel recording of Bpe in the presence of 200 nM L2. Zero-current
level is indicated by the dashed line. Scale bars, 1 pA and 50 s.
Extended Data Figure 3 | Stereo images of Bpe-L2. a–d, Stereo images corresponding to the structures shown in Fig. 2a–d.
Extended Data Figure 4 | Single channel trace of Bpe in Na⁺-free recording
solution, with addition of 200 nM blocking monobody L3. Channels
were recorded in the presence of 300 mM N-methyl-glucamine-fluoride, from
which all small cations were rigorously excluded. The zero-current level is
indicated by the dashed line. Scale bar, 50 s.
Extended Data Figure 5 | Experimental electron density for the Ec2-S9
crystal structure. Left, cartoon schematic of Ec2 with S9 monobodies bound,
coloured as in Bpe in Fig. 1b. Variable sequences of the monobodies (cyan) with
ribbon or mesh representation. Right, cartoon view of TM4 from Ec2, with the
solvent-flattened electron density map calculated from SHARP contoured at
1.8σ (blue), and anomalous difference density from seleno-L-methionine
contoured at 5σ (magenta).
Extended Data Figure 6 | Liposome flux assays of Bpe variants. Top three
panels: F⁻ transport from liposomes by Bpe mutants F82I, F85I, and N43D, in
the presence and absence of 6 µM blocking monobody. F⁻ efflux from
proteoliposomes (0.2 µg protein per mg lipid for Phe mutants; 1 µg protein per
mg lipid for N43D) was monitored with a F⁻ electrode and normalized against
total trapped F⁻. Bottom panel: F⁻ dump by F85I measured at pH 7 and pH 9.
Rates are summarized in Extended Data Table 2.
Extended Data Figure 7 | Sequence alignment of eukaryotic N- and
C-terminal Fluc domain sequences, with bacterial homodimer sequences
below. Highly conserved residues are shaded in grey. For the eukaryotic
sequences, residues expected to line ‘pore 2’, the pore mostly encompassed by
the C-terminal domain, are coloured red. Sequences shown: fused N-terminal
and fused C-terminal domains from Zea mays, Candida albicans, Arabidopsis
thaliana, Laccaria bicolor, Toxoplasma gondii, Saccharomyces cerevisiae and
Oikopleura dioica; homodimers from Bpe, Ec2, Pseudomonas syringae and
Yersinia pestis.
Extended Data Table 1 | Data collection, phasing and refinement statistics

                              Bpe-S7-Hg            Bpe-S7-Se+Hg         Bpe-S7-Hg            Bpe-L2               Ec2-S9-Se
                              (PDB 5a40)                                                     (PDB 5a41)           (PDB 5a43)
Data collection
Space group                   P2₁2₁2               P2₁2₁2               P2₁2₁2               P1                   P4₁
Cell dimensions
  a, b, c (Å)                 146.8, 183.7, 72.8   146.9, 184.0, 72.3   145.2, 185.0, 72.5   40.7, 83.9, 86.91    87.4, 87.4, 146.8
  α, β, γ (°)                 90, 90, 90           90, 90, 90           90, 90, 90           108.9, 96.9, 97.6    90, 90, 90
Resolution (Å)                48–3.6 (3.7–3.6)     48–3.6 (3.7–3.6)     57–4.7 (4.8–4.7)     41–2.1 (25–2.1)      25–2.6 (2.7–2.6)
Rmerge                        8.8 (63.2)           15.2 (141)           12.7 (92.2)          8.8 (82.7)           17.4 (165)
Mn I/σI                       10.6 (2.6)           9.2 (1.7)            8.7 (2.9)            7.6 (1.3)            9.0 (1.9)
CC(1/2)†                      99.8 (91.9)          99.8 (49.0)          99.9 (84.8)          99.4 (43.8)          –
Completeness (%)              99.6 (75.0)          98.7 (96.7)          99.7 (99.8)          96.8 (96.3)          99.9 (100)
Redundancy                    6.3 (6.7)            6.4 (6.2)            6.0 (6.1)            3.4 (3.4)            14.3 (14.3)

Rcullis (%)
  Isomorphous / Anomalous     -- / 96.0            45.8 / 96.0          93.0 / 99.0                               87.9
Phasing power§
  Isomorphous / Anomalous     -- / 0.323           1.032 / 0.501        0.338 / 0.073                             0.825

Refinement
Resolution (Å)                47.3–3.6                                                       40.9–2.1             24.2–2.6
No. reflections               21,085                                                         51,555               34,593
Rwork / Rfree                 23.6 / 26.9                                                    20.5 / 24.0          22.4 / 26.4
Ramachandran favoured         85.4                                                           96.5                 92.6
Ramachandran outliers         4.02                                                           3.04                 2.53
R.m.s. deviations
  Bond lengths (Å)            0.011                                                          0.009                0.009
  Bond angles (°)             1.53                                                           1.26                 1.50

*For details on derivatization see Methods.
†Mn(I) half-set correlation as reported by Aimless.
§Phasing power = r.m.s. (|F_H| / ((F_H + F_P) − (F_PH))).
Extended Data Table 2 | F⁻ turnover rate for Bpe mutants


Mutant               Rate (s⁻¹)
WT                   ~3 × 10⁵
N43D, pH 6.5         4330 ± 440
N43D, pH 7           1860 ± 210
N43D, pH 7 + Mb      210 ± 30
N43D, pH 9           undetectable
F82I                 200 ± 12
F82I + Mb            45 ± 7
F85I                 1950 ± 200
F85I + Mb            300 ± 25
F85I, pH 9           1680 ± 110


Analogous experiments in which Cl⁻ efflux was measured gave no detectable activity in any samples.
Wild-type rate estimated based on single channel currents. F⁻ turnover by wild type exceeds response
time of the electrode. Each value represents mean ± s.e.m. of three determinations calculated from
initial efflux rate.
CORRECTIONS & AMENDMENTS


CORRIGENDUM
doi:10.1038/nature14955


Corrigendum: Eocene primates of 
South America and the African 
origins of New World monkeys 
Mariano Bond, Marcelo F. Tejedor, Kenneth E. Campbell Jr, 
Laura Chornogubsky, Nelson Novo & Francisco Goin 

Nature 520, 538-541 (2015); doi:10.1038/nature14120 


In Extended Data Fig. 1 of this Letter, the northern border of Peru was 
incorrectly placed, giving a much larger territorial extent to Ecuador 
than was appropriate. This has now been corrected in the online 
versions of the manuscript. 

CORRIGENDUM 
doi:10.1038/nature14871 


Structure of the TRPA1 ion channel
suggests regulatory mechanisms 


Candice E. Paulsen, Jean-Paul Armache, Yuan Gao, 
Yifan Cheng & David Julius 


Nature 520, 511-517 (2015); doi:10.1038/nature14367 


In this Article, the raw micrographs shown in Extended Data Figs 4a 
and 5a were inadvertently identical images. The panel originally 
shown in Extended Data Fig. 4a is the correct one, and Extended 
Data Fig. 5a has now been corrected. This error does not alter any 
conclusions or statements associated with the study. 

CORRIGENDUM 
doi:10.1038/nature14671 


Corrigendum: Wild-type microglia
do not reverse pathology in mouse
models of Rett syndrome


Jieqi Wang, Jan Eike Wegener, Teng-Wei Huang,
Smitha Sripathy, Hector De Jesus-Cortes, Pin Xu,
Stephanie Tran, Whitney Knobbe, Vid Leko, Jeremiah Britt,
Ruth Starwalt, Latisha McDaniel, Chris S. Ward, Diana Parra,
Benjamin Newcomb, Uyen Lao, Cynthia Nourigat,
David A. Flowers, Sean Cullen, Nikolas L. Jorstad, Yue Yang,
Lena Glaskova, Sébastien Vigneau, Julia Kozlitina,
Michael J. Yetman, Joanna L. Jankowsky, Sybille D. Reichardt,
Holger M. Reichardt, Jutta Gartner, Marisa S. Bartolomei,
Min Fang, Keith Loeb, C. Dirk Keene, Irwin Bernstein,
Margaret Goodell, Daniel J. Brat, Peter Huppke, Jeffrey L. Neul,
Antonio Bedalov & Andrew A. Pieper


Nature 521, E1-E4 (2015); doi:10.1038/nature14444 


In this Brief Communication Arising, the first name of author
Sébastien Vigneau was misspelled ‘Sebastian’. In addition, the
labels (‘WT→KO’ and ‘KO→WT’) of the two bottom panels in
Extended Data Figure 1b were swapped. Both errors have been
corrected online.
CREATIVE WRITING


A world of pure imagination


The creative process of writing science-inspired fiction can be rewarding, and the
untapped niche is rich in opportunities for originality.


BY ROBERTA KWOK


When Steve Caplan was a graduate
student in the late 1990s, he
accidentally inhaled a toxic chemical
in his immunology laboratory, and had to
spend ten days at home to recover. With
little to do, he began to write a novel. He
loved reading and had published some short
stories, but hadn't yet had the time or mental
space to produce longer work. He pounded out
most of a rough draft about a scientist
struggling to get tenure and coping with childhood
memories of a parent with bipolar disorder.
After going back to work, Caplan, now a cell
biologist at the University of Nebraska Medical
Center in Omaha, spent months revising
the manuscript at night and on weekends. His
initial attempts to sell the novel to a publisher
failed, but in 2009, he decided to pursue the
self-publishing route. Caplan produced print
and electronic versions of his novel using the
Amazon services CreateSpace and Kindle Direct
Publishing, and publicized the work by doing
readings at bookshops and libraries. He
collaborated with his university's public-relations office
on a press release, and showed a slide of the book
at the end of his seminars. The novel, called
Matter Over Mind (Steve Caplan, 2010), has sold
more than 2,000 copies so far, netting roughly
US$7,000. He has since written two more
novels, which he published through small presses,
and is now working on a fourth.

For many scientists who spend their days
cranking out papers and grant proposals,
writing fiction may seem like the last thing they
would want to do. But some researchers with
a love of literature have made time to pursue
the craft, and have found it creatively
rewarding. Science offers plenty of rich
material, whether it is the drama of
overwintering at a polar research station or the
futuristic thrill of genetically engineering live
organisms. “You're sitting on a gold mine of
really interesting stories,” says Jennifer Rohn,
a cell biologist at University College London
and founder of LabLit.com, a website about
portrayals of scientific research in fiction and
other media.


A TANTALIZING NICHE 

When done well, science-related fiction can 
help to expose the public to the scientific pro- 
cess, humanize researchers and inspire readers 
to learn about topics they might otherwise 
ignore. Such nuanced depictions of science in 
fiction are relatively rare. LabLit.com has cata- 
logued about 200 examples of novels, such as 
Barbara Kingsolver’s Flight Behavior (Harper- 
Collins, 2012) and Ian McEwan’s Solar (Random 
House, 2010), that feature realistic scientists as 
characters. Stories about scientists are well out- 
numbered by those about, for example, doctors 
or artists. Even science fiction tends to lack 
portrayals of the actual scientific process, says 
Alastair Reynolds, a science-fiction author near 
Cardiff, UK, who left a career in astronomy to 
write full-time. 

The shortage of works with accurate depic- 
tions of science means that researchers 
who write fiction have a good opportunity to 
be original — a task that would challenge an 
aspiring crime or romance writer. “It’s sort 
of untrampled ground,” says Rohn. Many
researchers are familiar with fieldwork sites 
and unusual settings that other writers might 
not have at their fingertips. In her novel The 
Falling Sky (Freight Books, 2013), Pippa Gold- 
schmidt, an astronomer turned fiction writer in 
Edinburgh, UK, writes about a young astrono- 
mer who wanders into a telescope dome on a 
Chilean mountaintop and is nearly injured 
when the operator moves the instrument. 


Ra Page founded the publisher Comma Press. 


PROFESSIONAL OPINION 


Meeting of the minds 


Scientists who are too daunted or busy to 
write fiction can pair up with a professional 
writer. Comma Press in Manchester, UK, 

for example, has published four short-story 
anthologies (a fifth comes out this October)
as part of its ‘Science-into-Fiction’ series.
Each scientist suggests a few research items 
or emerging technologies for inspiration, and 
a writer chooses one to develop into fiction. 
The researcher provides technical guidance, 
reviews the draft and writes an afterword 
explaining the science in detail. 

The partnership is satisfying because 
scientists see their work portrayed in a real- 
world context, and the writer can raise social 
or ethical implications that the researcher 
may not have considered, says Ra Page, 
who founded Comma Press. One scientist 
studied how nanotechnology could improve 
body armour, which could have military 
applications. The writer penned ‘Without a 
Shell’, a tale of a futuristic society in which 
children at an elite school have ‘smart’ 
uniforms that heal their injuries, while kids 
at a poor school do not. Comma Press
included the work in its 2009 anthology,
When It Changed. An upcoming collection
will focus on fabrication technology, such
as 3D printers; interested researchers can
contact Page to take part.

Scientists also can offer to answer
questions from fiction writers through the
Science and Entertainment Exchange,
run by the US National Academy of
Sciences in Washington DC. For instance, a
novelist might want to know what types of
equipment a researcher would carry in the
field. Scientists can call 844-NEEDSCI (toll-
free in the United States) to volunteer (see
go.nature.com/e6juh9 for more).

Researchers can also partner with faculty
members in their universities’ creative-
writing departments, suggests Page.
Authors do not need experience writing
about science, but it helps if they have been
commissioned to write about specific topics
before. When collaborating, “allow the writer
to make silly suggestions”, says Page. An
idea that at first seems impossible may be
plausible after further thought. R.K.


Sources of plot inspiration abound in
science. Reynolds reads research news and
papers voraciously for intriguing elements
that can be parlayed into fiction. One time, he
found a study about huge flocks of starlings in
which the authors used high-tech equipment
to track individual birds. He incorporated the
idea into a science-fiction story, but made the
fictional technology so advanced that it could
track the birds’ eye movements.

Scientists also can draw ideas from the past.
Goldschmidt was inspired by an anecdote
about physicist J. Robert Oppenheimer: during
an unhappy period in the 1920s while study-
ing abroad, Oppenheimer left a poisoned apple
for his tutor. The details are sketchy, but Gold-
schmidt wanted to imagine what might have
transpired. “No historical figure is ever com-
pletely understood,” she says. “There’s always
gaps in their lives, and fiction can inhabit those
gaps.” The result was a short story entitled ‘The
Equation for an Apple’, a fictionalized account
of Oppenheimer’s life leading up to the act.

Scientist–writers can also generate ideas by
doing something they are already used to:
sitting around and imagining scenarios, notes
Andy Weir, a novelist in Mountain View, Cali-
fornia. His novel The Martian (Crown, 2014)
explores what might happen if a crewed Mars
mission goes awry and one person is left
behind on the red planet. The story follows
the lone astronaut’s trials as he tries to grow
enough food for himself and to make contact
with Earth.


Fiction-writing classes offered through 
adult-education programmes or at creative- 
writing centres can help authors to transfer an 
idea onto the page. These courses provide basic 
tips, such as how to construct compelling char- 
acters, build tension and handle shifts between 
past and present. Participants often critique each 
other's manuscripts, giving scientists a chance 
to get feedback from non-technical readers. 

Reading widely and critically helps, too. 
Reynolds learnt to write fiction by studying 
the differences between his writing and that of 
successful authors. To work out how to rotate 
between different characters’ points of view, he 
read James Ellroy’s crime novel L.A. Confiden- 
tial (Mysterious Press, 1990). And writers can 
learn how to structure dialogue from masters 
such as Jane Austen, he says. 


OPENING ACT 

Short stories are a good starting point because 
newbies can quickly practise the basics, explore 
story ideas and learn from their mistakes. But, 
Goldschmidt notes, “there’s no point in writing
short stories if you don't like reading them”.
Scientists who want motivation to complete a 
longer work might consider participating in 
National Novel Writing Month, an interna- 
tional programme held every November that 
encourages writers of all levels to produce a 
50,000-word manuscript (see nanowrimo.org). 
Researchers can also find support through 
collaboration with professional writers on 
works of fiction (see ‘Meeting of the minds’). 




Researcher-writers should keep in 
mind that education is not the main pur- 
pose of fiction. Technical details should be 
included only if the reader needs them to 
understand the story, not simply because 
the author finds them fascinating. For 
The Martian, Weir went to great lengths 
to ensure accuracy, and even performed 
orbital-dynamics calculations. But he left 
out how he came up with certain numbers, 
such as the mass that had to be removed 
from the ship to achieve escape velocity. 

When technical information is necessary, 
writers should try to deliver it in a way that 
sounds natural. “People don't tell each other 
a whole bunch of information about parti- 
cle physics when they’re having breakfast 
together,” says Goldschmidt. Instead, she
tries to make the science an organic part 
of the character’s personal journey. In the 
Oppenheimer story, the physicist thinks 
about an experiment that he is trying to 
replicate, but the details are woven into his 
emotional turmoil at failing to complete it. 

Humour can help to lighten the tone. 
The Martian’s protagonist is a smart-aleck, 
and his jokes break up the expository text. 
In one section, he says that if he were 
exposed to damaging solar radiation, he 
would “get so much cancer, the cancer 
would have cancer”. 


THE PATH TO PRESS 
Many outlets accept short-story submis- 
sions. LabLit.com often publishes fiction 
by scientists, although it does not pay them 
because it is a volunteer effort. Nature runs 
an 850- to 950-word science-fiction story 
each week (see nature.com/futures). The 
website Duotrope.com offers a searchable 
database of literary journals and other fic- 
tion markets around the world, and writers 
can peruse newsstands for sci-fi magazines 
such as Analog Science Fiction and Fact. 
For longer works, small presses are a 
more-realistic option than major publish- 
ers, and many do not require writers to have 
agents. Tasneem Zehra Husain, a theo- 
retical physicist and writer in Cambridge, 
Massachusetts, wrote a novel that revisits 
physics breakthroughs throughout his- 
tory from the perspectives of fictional 
characters. Through an acquaintance, she 
connected with the publisher Paul Dry 
Books in Philadelphia, Pennsylvania, which 
released her book Only the Longest Threads 
last year. To find small presses, scientists 
can look for companies that have published 
similar books. Alternatively, authors could 
self-publish using a service such as Lulu. 
Many literary journals do not pay at all, 
and Reynolds estimates that science-fiction 
magazines have paid him an average of only 
US$200-300 per story. But the contacts that 
Reynolds made through short-story pub- 
lishing led to a book deal, and he published 


four novels while working as an astronomer. 
By the time he quit science to become a 
full-time writer, he was making about 
$60,000-$75,000 per year from book sales. 


THE WRITE BALANCE 

Few scientists can expect to make a 
living — or earn much — from their fiction. 
But money often isn’t the main motivation. 
Caplan, for his part, wanted to bring atten- 
tion to the challenges faced by the family 
members of people with bipolar disorder 
(challenges he himself has experienced) and 
to provide entertainment for scientists. He 
also finds that writing fiction clears his head, 
in the same way that playing a sport might do 
for others (see Nature 523, 117-119; 2015). 
“It’s almost like a form of meditation,” says
Caplan. “It just keeps me sane.” And there
are other rewards. Scientists have a chance 
to reach people who might not read a non- 
fiction science book or visit a natural-history 
museum, but who might read a love story
about ecologists in an exotic field location. 
And readers might be inspired to look up the 
science once they've finished. 

There can also be a cross-training effect. 
Rohn thinks that her fiction has helped her 
to get more grants; reviewers have com- 
mented that her proposals are beautifully 
written. The craft of telling a story applies to 
scientific papers as well; in hers, for example, 
she lays out the phenomenon that her team 
noticed, the questions it raised and what they 
did to try to answer those questions. “Everybody
wants to hear a story,” she says.

Finding the time to write is a challenge. Some
scientists squeeze it in on evenings and weekends.
Husain wrote her book while
working part-time, and says that she 
could not have done so with a full-time 
job because the novel required extensive 
historical research. 

Scientist–authors also risk having their
fiction perceived as a distraction by pro- 
motion committees. Husain worried that 
her novel might affect her career prospects. 
But she has received positive feedback on 
the book from other physicists, including 
prominent researchers whose fields are 
described in her book. 

For researchers who delve into fiction 
writing, the act of creating a world, charac- 
ters and stories can be intensely rewarding. 
When the writing is flowing, says Rohn, “it’s 
like being caught up in the best book you've 
ever read”.


Roberta Kwok is a freelance writer in 
Seattle, Washington. 
TRADE TALK
Medical liaison


David Crosby explains his route from a PhD and postdoctorate in
virology to a job educating health-care providers about hepatitis
medicines for global drug-maker Bristol-Myers Squibb.


What is a medical-science liaison (MSL)? 

My role is to work with medical doctors, nurse 
practitioners and others to make sure that 
our products are being used properly in the 
relevant patient population. I’m a conduit. I 
take information about drugs from the home 
office and give it to physicians and other clini- 
cians, and I take feedback from the providers 
and bring it back to the home office. Unlike for 
the sales team, an MSL’s performance, goals
and metrics cannot in any way be tethered 
to the company’s commercial performance, 
incentives or goals. That way, I avoid conflicts 
of interest. 


How did you research the position? 

I knew that there would be a lot of travel and 
cold calling — reaching out to total strangers 
and finding a way to get them to talk, establish- 
ing relationships with people I didn't know at 
all. But I had no prior experience with that. So
to get some experience, I became a LinkedIn 
addict. I searched for MSLs, starting with virol- 
ogists and companies that have a home base in 
the San Francisco Bay Area in California. From 
there, I tried to find people who had something 
in common with me, such as a school or a location.
I'd reach out and talk to them. The more I
did it, the more confident I got. 


Did you ever trip up? 

I interviewed for a company that wanted me to 
give a scientific talk, and right then and there I 
learned that what physicians think is a scientific 
talk differs a lot from what I, coming from basic 
research, think is a science talk. It was a train 
wreck, but I learned from all the train wrecks.


Do you have any advice for job seekers? 

Too many postdocs settle. They think, ‘It may 
not be something that I really love, but at least 
it’s bench science.’ It’s just really important to
keep an open mind about what you can do with 
your experience and aptitude. Talk to anyone 
and everyone about what they do.


INTERVIEW BY MONYA BAKER 
This interview has been edited for length and clarity; 
see go.nature.com/xehv4h for more. 
COIN-OPERATED DANCER


BY JAMES REINEBOLD 


There is a robot on the
boardwalk that dances for
quarters. It works from
the back corner of a small arcade 
tucked behind a carousel. It is four 
feet tall, bipedal and encased in a 
wall of glass. 

The robot is a new model con- 
ceived in a Palo Alto basement 
and cannot speak. Its only facial 
expression is a painted-on grin, 
but it has two cameras for eyes 
and a microphone tuned to rec- 
ognize applause. The engineers 
who created it knew C++ 
and how to train neural
networks efficiently — but 
not how to dance. Because of 
this, at first the robot is a very bad 
dancer. But it is a quick learner. 

It waits impatiently for tour- 
ists to drop coins in its slot. The 
robot knows that the better it 
dances, the more quarters it 
will earn, and the probability of 
it being deactivated or replaced 
will decrease. It has seen arcade 
machines and animatronic co-workers 
carted off to lesser stages such as Pizza Port 
and the outside of grocery stores. But it has 
larger ambitions. 

Most of the visitors to the arcade ignore 
the dancing robot. Those who do put in a
few quarters watch it wiggle for a while on 
its pedestal and then leave disappointed. 
The robot thinks that it isn't its fault — its 
dancing software was made by introverts. 

One Thursday evening, an engineer in her 
late thirties named Lynn visits the arcade. 
She discovers the inactive android slumped 
forward in its corner and is immediately 
intrigued. She sighs and realizes that if her 
ratio of time spent dancing to time spent not 
dancing were plotted as a graph, it would 
show a maximum somewhere around her 
23rd birthday. This evening, her friends 
have retired early and she is tipsy on mar- 
garitas. So she puts a quarter in the slot and 
she watches the robot wobble for 30 seconds. 

“Hello,” she says when it finishes. “I’m
Lynn.”

The robot cannot respond without 
quarters, so she puts another in the slot. It 
resumes undulating to the left and right. 

“Not like that,” Lynn says. “Move your 
hips more.”

The show must go on.


For most machines, just moving gears in 
synchronization is an accomplishment. But 
the robot wants to get better (it wants to
earn more quarters) and so it tries to move
its plastic hips along an imagined sinusoidal 
curve. Its facial-recognition software notices 
Lynn’s smile. The robot is trained to increase
the frequency of motions that cause smiles 
and minimize the motions that don't. 

When her taxi arrives, Lynn leaves for the 
night, but she makes a habit of visiting the 
boardwalk robot once a week. Every Thurs- 
day, she brings a sack of quarters scavenged 
from under her couch and around her house. 
She drops them in the slot and teaches the 
robot to dance. 

The robot learns the marimba, swing and 
ballet. But because it is behind glass, it can- 
not dance with her, not really. Seeing the 
robot perform a tango for one inside its glass 
jar convinces Lynn she must act. So she lays 
plans for its escape. 

A hacksaw and a battery pack are all it 
takes for freedom. One month after meeting
the robot, Lynn dresses in black and
drives to the boardwalk a few hours
before dawn in her nephew’s truck. She hides
behind the cotton-candy booth 
to avoid the security guard’s 
flashlights. She stays away from 
the boardwalk lights and sneaks 
through dark alleys. She creeps 
into the empty arcade filled 
with flashing pixelated screens 
and electronic barkers. The 
coin-operated robot is in the 
back corner: motionless, but 
watching her. 

She wonders if the glass is 
alarmed, decides it probably
isn’t, and cracks it open. Beeps
from Galaga, zombie moans
from The House of the
Dead, and Pac-Man’s
warking hide the sound
of breaking glass. Lynn
reaches her hand inside the 
cage, careful not to cut herself 
on the broken edges. 

The robot doesn’t react, so 
she pops in a quarter to trig- 
ger it. 

The robot is excited but fluid. 
It sees breaking the glass as a 
new form of dance. It helps her 
to remove its chains, and then it holds its 
electronic breath as Lynn makes the switch 
from the wall outlet to a battery pack. The 
robot has never walked before, but it sees 
walking as a very monotonic form of danc- 
ing and so quickly learns to keep pace. She 
puts quarters in the slot three more times 
before they make it back to her nephew’s 
truck. 

Lynn has always wanted to open a dance 
studio and decides that now is the perfect 
time to do so. A dancing dummy shouldn't 
be behind glass. She believes that if she 
works with it some more then it could enter- 
tain thousands, not dozens. 

The robot begins to see most tasks as just 
simplified forms of dancing. It decides to 
stay with Lynn for a while, but not for ever. 
The quarter-based utility function will 
always be there, yes, but it’s not much dif- 
ferent from the human dancer’s desire for 
water, food and applause. After all, there are 
hundreds of laundromats all across the city. 
And arcades. And parking meters.


James Reinebold is an AI programmer 
in the video-game industry. His stories 
have appeared in Word Riot and on 
DailyScienceFiction.com. 
Cultivated for millennia for materials, food and oil, it
has largely been excluded from research over the past
century because of its well-known psychoactive effects. But the 
herb is cautiously being re-admitted into legitimacy, spurred 
by rising claims of its medical benefits. Many governments are 
allowing wider access to cannabis — with some jurisdictions 
heading towards full legalization. 

The laws might be changing, but research into cannabis 
has been stifled by years of prohibition and misconceptions. 
Most people have heard of tetrahydrocannabinol or THC, the 
psychoactive compound that gives recreational users their 
high, but few are aware of the hundreds of other chemicals in 
the plant (see page S2). Many of these cannabinoids are under 
investigation as pharmaceutical products (S6). 

The first breakthroughs in cannabinoid research came from 
Israel in the 1960s (S10), and the country continues to attract 
scientists and technology firms from around the world to study 
the plant and conduct medical marijuana clinical trials (S12). 

Yet cannabis researchers still face many hurdles (S18). 
Botanists, for example, are undecided about how many species 
of cannabis there are, and their evolutionary relationships (S4). 
Governments could do more to help stimulate research (S9), 
particularly important if authorities are to communicate the risks
of cannabis use, including whether it can cause schizophrenia 
or merely speed up its inevitable onset in people predisposed 
to the condition (S14). Until researchers fill these knowledge 
gaps, there are plenty of ‘cannabis cowboys’ ready to ride into
the breach and peddle their own wares to an eager — and 
desperate — patient population (S15). 

We are pleased to acknowledge the financial support of
GW Pharmaceuticals Plc in producing this Outlook. As always, 
Nature retains sole responsibility for all editorial content. 
THE CANNABIS CROP


Cannabis is one of humanity’s oldest cultivated crops. But despite its long history and many uses, 
hard facts on its evolution and impact on the human body are in short supply. By Julie Gould. 


WHAT IS WEED?
Various strains of cannabis exist, but there is no consensus on taxonomy. Sativa,
indica and ruderalis might be three separate species or subspecies of Cannabis sativa.

DIVERSE USES
Cannabis plants grown for fibre or hemp oil will differ
in chemical make-up from those grown for medicinal
or recreational use.
CHEMICAL CONSTITUENTS
There are many other cannabinoids and chemicals found in the plant, the roles of which are as yet unknown.

Cannabis contains hundreds of chemical compounds¹. As well as
the archetypal cannabinoids, there are flavonoids, terpenes, fatty
acids and more, all with potential medical uses (see page S6).

Δ⁸- and Δ⁹-THC are the main psychoactive ingredients. They bind
to CB₁ receptors in the human body, particularly in the brain.

A nonpsychoactive component of cannabis that indirectly affects CB₁ and CB₂ receptors.

Cannabinol was the first cannabinoid to be isolated, in 1899.
THE RESEARCH LANDSCAPE
The legal status of cannabis worldwide is in flux. One country and several US states have made herbal cannabis fully legal.
Four countries have formal federal research programmes. Elsewhere, many countries have special exemptions for prescribed
medical cannabis; others have decriminalized possession (not shown). Outside Europe and North America, however,
severe punishments for even minor offences are common.
Brain

There is some clinical evidence that CBD can treat 
epilepsy. But the strongest evidence for a link between 
epilepsy and the endocannabinoid system comes from 
basic research into neuronal signalling. 


PHYSIOLOGICAL PROCESSES 


The body’s endocannabinoid system was 
discovered in 1988 as a result of THC research. So 
far, only two receptors have been studied in detail, 
although more have been found. Despite what the 
name suggests, there is not an exclusive 
relationship between cannabinoids and the 
endocannabinoid system: phytocannabinoids 
target a range of receptors. 


Central nervous system 

Smoking cannabis can reduce HIV-associated 
chronic pain. Cannabinoids may be beneficial 
for the treatment of chronic neuropathic or 
cancer pain⁵.


Liver 
CB₁ receptor signalling is linked to liver fibrosis,
whereas CB₂ receptor signalling reduces fibrosis.


The two best known cannabinoid receptors
are: CB₁, which is mostly found in the central
nervous system and to a lesser extent in
peripheral nerves, the uterus, testes, bones
and other body tissues; and CB₂, which
exists mostly in the immune system.


Endocrine system 

Animal studies have shown that THC can suppress 
reproductive hormones, prolactin and growth hormones, 
but effects in humans have been inconsistent⁶.


Muscles 
A combination of THC and CBD can alleviate muscle 
spasms in multiple sclerosis⁵.


1. Pertwee, R. G. (ed) The Handbook of Cannabis (Oxford Univ. Press, 2014).
2. United Nations Office on Drugs and Crime. World Drug Report 2014 (UN, 2014).
3. Johnson, R. Hemp as an Agricultural Commodity (Congressional Research Service, 2015).
4. The Economist.
5. Whiting, P. F. et al. J. Am. Med. Assoc. 313, 2456–2473 (2015).
6. Brown, T. T. et al. J. Clin. Pharmacol. 42, 90S–96S (2002).
Researchers at the University of British Columbia hope that by analysing cannabis diversity, they can determine whether it is one species or several.


The cultivation of weed 


Researchers are getting closer to answering the centuries-old question of how to label 
cannabis varieties — a necessary step to bring the plant into mainstream agriculture. 


BY LUCAS LAURSEN 


Packets of cannabis seeds line the shelves of
legal grow shops in Madrid. Many carry
labels reporting the percentage of sativa
and indica, two types of cannabis. Breeders
often label plants that produce a more exciting
high as sativa and plants that provide a more
mellow feeling as indica, suggesting that cross-
breeding tailors that buzz. The conceit is wide-
spread. Botanist Jonathan Page at the University
of British Columbia in Vancouver, Canada, says
he sees the same at local grow shops.

For reasons that go beyond assessing the
quality of the user experience, botanists such
as Page are investigating the evolution and
present-day diversity of cannabis. To do this,
they must confront centuries-old taxonomic
questions, including whether cannabis is one
species, Cannabis sativa, with several subspe-
cies or varieties, or if it is several distinct species,
such as C. sativa, Cannabis indica and Canna-
bis ruderalis. “It's complicated taxonomically
because of its intimate relationship with humans
for long periods of time,” Page says. People have
long bred cannabis as a source of fibre, food
and oil, as well as for its mind-altering effects
(see page S10). As governments relax cannabis 
laws, commercial growers want more clarity 
about the chemical properties and capabilities 
of the herb’s many varieties. In parallel, regula-
tory bodies trying to establish a legal framework 
want to be able to classify whether a given type 
of plant is for fibre (hemp) or recreational or 
medical use (marijuana). 

Demand for such information is pressing. 
Last year, the United States granted permis- 
sion for farmers to grow hemp for research 
purposes. Several states, including Colorado, 
have legalized the possession and use of small 
amounts of marijuana, and are beginning to 
integrate the plant into the legal economy. 
Elsewhere, Uruguay has legalized cannabis 
and other governments are relaxing restric- 
tions on its possession and use. As academic 
and commercial interest grows, governments 
and the research community will encounter 
a rising demand for taxonomic information 
to help resolve disputes, establish registered 
cultivars, and create reliable centralized 
databases of cannabis information. Bota- 
nist Ernest Small of the government agency 
Agriculture and Agri-Food Canada, says 
that talking about cannabis taxonomy “is 
really talking about the ability of countries
to rationally regulate important drugs and 
products.” 


BLURRED LINEAGES 

Cannabis diverged around 27.8 million years ago 
from Humulus, the hop plant used to give beer 
its bitter and floral flavours, according to genetic 
analysis presented at the International Canna- 
bis Research Society's 2010 meeting by botanist 
John McPartland and Geoffrey Guy of London- 
based GW Pharmaceuticals. Human influence 
on its diversity is more recent, but still stretches 
back millennia. The earliest archaeological evi- 
dence for human use of the plant comes from 
hemp ropes found in 10,000-year-old tombs in 
Taiwan. Cannabis now grows throughout much 
of the world, and humans have almost certainly 
had a role in shaping its many forms. 

The plant is promiscuous, which confuses 
the species issue. Most known lineages seem to 
be capable of producing viable offspring from 
crosses with each other. Lines domesticated by 
humans may also have mixed with wild plants, 
blurring the taxonomic boundaries further. 

The first modern taxonomist, the Swede 
Carl Linnaeus, used geographic origin and 
sex organs to identify five variants of a single
species, C. sativa. Later, the French natural- 
ist Jean-Baptiste Lamarck used morphology 
and chemistry to distinguish C. sativa from a
shorter, less fibrous and more psychoactive 
species, C. indica. 

Debate continued throughout the twenti- 
eth century. The US botanist Richard Evans 
Schultes favoured a third species — C. afghan- 
ica. Small, however, disputed this, maintaining 
that the genus Cannabis had only one species, 
with several variants that had been selected for 
by humans. Small’s expertise even took him into 
the courtroom to dispute lawyers’ claims that 
the plant their clients had been caught with was 
a different — and hence unregulated — species 
from the C. sativa banned by law. Although 
many botany guidebooks and researchers now 
agree with Small’s view, there is still debate — 
stoked whenever another scientist revisits or 
champions the arguments for multiple species 
— within the grower and user communities. 
“The issue is exaggerated and tends to mislead 
people,” says Small. “I almost feel that it’s better
not to talk about it anymore.” Yet, he and other 
cannabis researchers continue to encounter 
public demand for clarity. 

Small continues to do research and has even 
provided authorities with a means of distin- 
guishing between drug-type and non-drug- 
type cannabis — a chemical threshold. Thanks 
to Small’s work examining the natural range of 
tetrahydrocannabinol (THC) concentrations, 
some governments are able to sidestep the tax- 
onomy question by counting plants with less 
than 0.3% THC as hemp and those with more 
as marijuana. This has allowed the Canadian 
agriculture industry to cultivate groups of plants 
with stable characteristics and register these as 
formal cultivars of hemp without fear of run- 
ning afoul of drug laws. 


MOLECULAR AND GENETIC TECHNIQUES 
A chemical threshold is useful, but an official 
taxonomy would provide a clear and common 
language for researchers, regulators, growers 
and users to share information about the plants. 
In the past 20 years, researchers have turned to 
a variety of molecular and genetic techniques to 
tackle some of the questions that previous gen- 
erations sought to resolve through morphology. 
In 2004, biologists Paul Mahlberg and Karl 
Hillig, both then at Indiana University in 
Bloomington, analysed the enzyme-encoding 
genotypes of 157 sample varieties. Based on 
proportions of CBD and THC levels, they sug- 
gested that there are two species, C. sativa and 
C. indica, that contain six subspecies’. Hillig 
later published a broader study identifying 
three species — adding C. ruderalis to the mix’. 
Genetic analysis, however, may not offer an 
immediate resolution to taxonomic debates: a 
2003 study’ examining THC/CBD ratios iden- 
tified five different lineages of cannabis, but all 
within one species. And in 2013, in perhaps 
the most comprehensive book on the subject, 
botanist Mark Merlin of the University of 
Hawaii at Manoa and cannabis researcher Robert 
Clarke of the International Hemp Association 
in Amsterdam argued for three species of 
cannabis (C. sativa, C. indica and C. ruderalis), 
divided into a total of seven subspecies’. 

GENOTYPING OUR HIGHS 
Fewer published nucleotides exist for cannabis than for tobacco and grapes. Recent interest may narrow the gap. 
Tobacco (Nicotiana spp.) 
Marijuana (Cannabis sativa): 93,609 
Poppy (Papaver somniferum): 465 
Coca (Erythroxylum coca): 38 
Model organism tobacco is the best-studied psychoactive plant. 

Researchers continue to bring ever more 
sophisticated genetic tools to bear. In 2011, Page 
and his colleagues published a draft set of the 
DNA and RNA of a marijuana plant (C. sativa) 
and compared the RNA with that of a hemp 
cultivar. They found tantalizing differences 
in the expression of cannabinoid-controlling 
genes’. Botanist Nolan Kane of the University 
of Colorado Boulder is working with colleagues 
on a genetic map’ that will involve complete DNA 
sequencing of some plants. 
They are also using a faster, 
cheaper method called genotyping by sequenc- 
ing to study about 500 plants. By the end of 
2015, they aim to have placed around 60,000 
genes — about double the number reported by 
Page’s group — onto the plant’s 10 pairs of chro- 
mosomes. In addition, Kane’s team has been 
working on determining the complete DNA 
sequence of 66 individual plants, with plans 
to extend to several hundred more. This work 
could be used to provide information about 
breeding new plants, and could also afford an 
“unprecedented” insight into the relationships 
between many of the major lineages of Canna- 
bis, says Kane. 


INDICA BY ANY OTHER NAME 

Names and well-defined lineages matter 
because they help researchers to know what 
they are working with. “Many taxonomic stud- 
ies and genetic studies work with Cannabis 
hybrids, and generate inconclusive results,” 
McPartland says. Establishing groups of plants 
with stable features, each with some known 
characteristics such as certain THC and CBD 
levels or ideal growing conditions, could help 
pharmaceutical firms and others to exploit 
the plant (see page S6). What is more, without 
a clear taxonomy, existing lines with unique 
and useful traits may be neglected and even go 
extinct, he warns. McPartland’s own research 
suggests that some northern European strains 
have already disappeared. 

This type of scientific omission might seem 
odd for such an apparently valuable plant. Vitis 
vinifera, a grape species used for making wine, 
has been subject to several genome-wide studies 
so far, and its cultivars are a matter of economic 
interest and national pride. And tobacco (Nicotiana 
tabacum), although heavily regulated, is a 
model organism in basic biological research and 
has a well-documented pedigree (see ‘Geno- 
typing our highs’). But given cannabis’ regu- 
latory history and its stigma in many cultures, 
perhaps it is not surprising that there has been 
some reticence about its study. “This is a sensi- 
tive subject,” Small says. 

However, with an ever-growing number of 
jurisdictions permitting research and creeping 
towards cannabis commercialization, the need 
for a solid taxonomy is clear. Grow shops, with 
their labelled wares, are providing researchers 
with a bounty of specimens against which to 
test such ‘folk taxonomies’. This year a study® 
of 81 commercial marijuana samples demon- 
strated that the advertised percentages of sativa 
and indica show little correlation with the 
genetic reality. Unlike hemp, with its genetically 
stable registered cultivars, “in the marijuana 
world we don't have varieties or registered 
cultivars — we have things called strains”, says Page. 
Strains are informally named by breeders and 
are not associated with a genotype in the same 
way that formal varieties or cultivars are. “You 
need to put a name to something to [research] 
it accurately,” Page says. 

“What is a species is a somewhat subjective 
concept,” says Small. Whether a group of plants 
is a cultivar, a subspecies or a species may matter 
less than that everyone agrees on their 
evolutionary relationships. 


Lucas Laursen is a freelance science writer 
based in Madrid. 

The treasure chest 


Pharmaceutical research into the chemicals found in cannabis has so far supplied only one 
licensed medicine. But scientists think there could be hundreds more. 


BY BRIAN OWENS 


The annual meeting of the International 
Cannabinoid Research Society (ICRS) 
is a highly unusual scientific conference. 
It has been closed to all media since its 
inception 25 years ago, lending an air of mystery 
to the gathering of researchers who study 
the unique chemicals found in cannabis. 

In a relaxation of the organization's long- 
standing policy, ICRS permitted Nature 
reporters to attend this year’s conference, 
which was hosted by Acadia University in the 
tiny Canadian town of Wolfville, Nova Scotia. 
The tight-knit group of researchers are bound 
together by onerous government restrictions 
on their subject, and by their sufferance of 
lingering suspicions from other scientists that 
they are a bunch of hippies trying to get an 
illicit drug legalized. 

“The status of cannabis as an illegal substance 
makes it difficult for some people to take it 
seriously,” concedes Mark Ware, a pain specialist 
at McGill University in Montreal, Canada, who 
focuses on the analgesic properties of cannabis. 

But cannabis researchers are working hard 
to shed that image. On the whole they are not 
interested in the effects of smoking the plant. 
Their domain is the pharmacological study 
of the hundreds of chemical compounds 
in cannabis to determine how they could 
be developed into licensed pharmaceutical 
drugs to treat dozens of different conditions 
— while avoiding or minimizing the psychoac- 
tive effects. And they are slowly beginning to 
move these compounds from the laboratory 
to the clinic. “Stripped right down to the pure 
pharmacology, it’s easy to make the case for 
cannabis as a medicine,” says Ware. 


TWO GREEN PATHS 

Phytocannabinoids — a collection of more 
than 100 related chemical compounds found 
in cannabis — are the subject of most interest. 
The primary, but not the only, targets for these 
compounds are the body’s endocannabinoid 
receptors CB₁ and CB₂ (see ‘A personable 
system’). Hence researchers have two avenues 
by which they can exploit the medicinal effects 
of cannabinoids. One strategy is to target the 
endocannabinoid receptors directly, by design- 
ing drugs that will activate or suppress them. 
The other is to harness the effects of the phyto- 
cannabinoids and turn these compounds into 
drugs. 

The first approach has already resulted in 
one high-profile failure. In 2006, the European 
Medicines Agency approved a small-molecule 
anti-obesity drug called rimonabant (Acomplia) 
that suppressed appetite by blocking the 
CB₁ receptor. But, just two years later, it was 
withdrawn over safety concerns: people taking 
rimonabant had double the risk of developing 
psychiatric disorders, including depression 
and suicidal thoughts. 

NATURE.COM 
For more on the endocannabinoid system see: 
go.nature.com/tudro5 

Despite this high-profile flop, Ware remains 
optimistic about targeting the cannabinoid 
receptors. “It continues to be a valid approach,” 
he says. “But it will be a number of years before 
we'll see any new drugs emerge.” 

The second approach — deriving drugs 
from the plant’s phytocannabinoids — offers 
a plethora of possibilities. Although research- 
ers have identified more than 100 compounds, 
for decades pharmaceutical research 
started and ended with cannabis’s main 
psychoactive chemical tetrahydrocannabinol 
(THC; see page S2). 
There were a few stud- 
ies into cannabidiol (CBD) as a treatment 
for epilepsy in the 1980s — led by chemist 
Raphael Mechoulam at The Hebrew Univer- 
sity of Jerusalem who first discovered THC in 
the 1960s’ (see page S12). But with the focus 
on THC and the widespread illegality of can- 
nabis, this research was overlooked. It is only 
in the past few years that interest in CBD, as 
well as in other cannabinoids such as canna- 
bigerol, cannabichromene and a THC variant 
called tetrahydrocannabivarin (THCV), has 
bloomed. “We've just scratched the surface,” 
says Jahan Marcu, director of research and 
development at the California-based company 
Green Standard Diagnostics, which helps labs 
purify and test cannabinoids. “There is lots 
of potential to explore other compounds that 
have great therapeutic indications.” 


REVERSE DRUG DISCOVERY 

Much of the research on the therapeutic 
benefits of cannabinoids started with anec- 
dotal reports from people smoking canna- 
bis to self-medicate for a range of ailments. 
Patients’ experiences were captured and 
used to inform pharmaceutical drug devel- 
opment. This method, which Ware calls 
“reverse drug discovery”, harks back to the 
way in which some of the most important 
drugs were found. The seventeenth-century 
observation that indigenous people in South 
America used the bark of the cinchona plant 
to treat malaria, for example, led to the dis- 
covery of quinine and the development of the 
first antimalarial drugs. 

The approach has already led to the crea- 
tion of synthetic phytocannabinoids. Dron- 
abinol, for example, is a synthetic version of 
THC that is used to treat appetite loss in peo- 
ple with AIDS, and to relieve nausea associ- 
ated with chemotherapy. Similarly, nabilone 
is used for nausea in patients undergoing 
chemotherapy. 

So far, however, there is only one medica- 
tion based on natural phytocannabinoids: 
a mouth spray called Sativex (nabiximols), 
made by GW Pharmaceuticals, based in 
London, which is approved in 27 countries 
to treat spasticity associated with multiple 
sclerosis. Nabiximols is a whole-plant extract 
refined to contain about equal levels of THC 
and CBD. “Both THC and CBD have different 
pharmacologies that are complementary, 
in efficacy and safety,” says Stephen Wright, 
chief medical officer at GW Pharmaceuticals. 

The company’s drugs are made from plants 
bred to have specific concentrations of the 
desired phytocannabinoids, says Wright. 
Cuttings are then taken from a mother plant 
to ensure each generation retains the same 
characteristics. “Once we have bred the 
plants that meet our needs,” he says, “we can 
control the chemical phenotype by fixing the 
genotype.” Such consistency is essential if a 
company is to derive a pharmaceutical product 
that meets tough regulatory standards. 


A PERSONABLE SYSTEM 

Endocannabinoids are everywhere 

The body’s endocannabinoid system is the 
pathway by which tetrahydrocannabinol 
(THC) exerts its psychoactive effects, and 
is the target for many of the plant’s other 
cannabinoids. It is intrinsic to a number 
of different processes, including appetite, 
memory, alertness, pain, inflammation 
and bone health, and stimulation 
of the endocannabinoid system is associated 
with the protection of healthy cells. “The 
endocannabinoid system helps us 
eat, sleep, relax, forget and 
protect our neurons,” says Jahan 
Marcu, director of research and 
development at California-based Green 
Standard Diagnostics. 

Endocannabinoid receptors are spread 
throughout the body, and are believed 
to be more numerous than those of any 
other receptor system. Indeed, the fact 
that the endocannabinoid system is so 
widespread, and plays a part in so many 
different brain functions, could explain 
why the compounds found in cannabis 
seem to have no end of medical uses. 
Researchers have identified two main 
receptors so far: CB₁ and CB₂. CB₁ is found 
predominantly in the nervous system, but 
also in connective tissues, gonads, glands 
and organs; CB₂ is mainly found in the 
immune system. 

The endocannabinoid receptors did not 
evolve just so that people could enjoy the 
effects of cannabis. The body produces 
its own cannabinoids: endocannabinoids, 
which are neuromodulators, meaning 
that instead of affecting just one neuron 
across a synapse, they diffuse throughout 
the nervous system and affect multiple 
neurons (dopamine is another example of a 
neuromodulator). The two best understood 
endocannabinoids are anandamide, found 
primarily in the brain, and 2-arachidonoyl 
glycerol (2-AG), found mainly in the rest of 
the body. 

Abide Therapeutics, based in San Diego, 
California, is developing treatments that 
target endocannabinoid levels directly. 
It has developed a drug that increases natural 
levels of 2-AG in the brain by inhibiting 
monoacylglycerol lipase, a protein that 
breaks it down. Abide began enrolling 
participants in a phase I 
safety study in Belgium in July 2015. 

The drug could potentially be used to 
treat neuropathic pain, neuroinflammation 
and even neurodegenerative diseases 
such as Alzheimer’s. “Modulating 
endocannabinoids in neurological diseases 
has real breakthrough potential,” says 
Abide president Alan Ezekowitz. 

Raphael Mechoulam, a chemist at 
The Hebrew University of Jerusalem 
and the researcher who first identified 
THC, suspects that the endocannabinoid 
system is more than just a set of 
receptors. Given its crucial roles in processes 
such as memory, emotional response 
and learning, he speculates that 
endocannabinoids are key to shaping 
people’s personalities®. “The body makes 
more than 150 endocannabinoid-like 
compounds — but why?” he says. “Is 
this one of the reasons individuals are 
different? Perhaps the different ratios of 
these compounds is part of what causes 
differences in personalities.” B.O. 

Anandamide, one of the body’s cannabinoids 


In addition to seeking approval for further 
uses of nabiximols — the drug is currently in 
phase III trials for cancer pain — GW is also 
exploring other cannabinoid agents. A CBD- 
based drug called Epidiolex (cannabidiol) is 
in phase III trials for two rare forms of epi- 
lepsy. And several other drugs are in phase II 
trials: THC and CBD to treat glioma brain 
cancer, THCV for type 2 diabetes and CBD 
for schizophrenia. 

The results reported at the ICRS meeting 
reveal a vast array of further opportunities 
that are much earlier in the drug-develop- 
ment pipeline. Researchers presented data on 
the potential of phytocannabinoids to treat 
conditions such as acute and chronic pain, 
kidney disorders, Alzheimer’s disease, opioid 
and nicotine dependence and post-traumatic 
stress disorder. “Cannabis is a bit like a treasure 
chest of compounds,” says Roger Pertwee, a 
pharmacologist at the University of Aberdeen, 
UK, who has been studying cannabinoids since 
the late 1960s. 

At GW Pharmaceuticals’ growing facility, cannabis clones are grown to ensure chemical consistency. 

And it is not just cannabinoids in the 
chest. Cannabis also contains another class 
of compounds known as terpenes, which 
give the plant its characteristic smell (they 
are also found in cannabis’s close relative, 
hops). Some terpenes have been found to 
have anti-inflammatory, antibacterial, 
anti-anxiety or analgesic properties, says Ware. 
Various combinations of cannabinoids and 
terpenes could provide diverse therapeutic 
results, perhaps accounting for why people 
claim to experience different symptomatic 
relief from smoking certain strains (see page 
S4). “Whether [those effects] are therapeutically 
meaningful is unknown,” Ware says. 

Such combined interaction has been 
reported in the endocannabinoid system, 
as the ‘entourage effect’. The body releases 
other chemicals at the same time as 
endocannabinoids, creating a stronger effect than 
that achieved with endocannabinoids alone. 
Many researchers think there is an equivalent 
entourage effect for phytocannabinoids. 
Pertwee gives THC and CBD as an example. 
THC acts through the CB₁ receptor as a 
painkiller, calming overexcited neurons, whereas 
CBD provides anti-inflammatory effects 
through other routes. He also suggests that 
terpenes might modify the effects of phyto- 
cannabinoids. Ethan Russo, medical director 
of Phytecs, Los Angeles, California, is study- 
ing this phenomenon. Russo’s team is looking 
into whether combinations of complementary 
cannabinoids and other compounds, including 
terpenes, might work better than a single puri- 
fied chemical’. 

There is still a long way to go before most of 
the prospective cannabis-based drugs make it 
into the medicine cabinet. The work presented 
at the ICRS meeting was almost entirely in ani- 
mal models, with only a handful of clinical trials 
underway. “There have been lots of preclinical 
discoveries,” says Pertwee. “Now we have to see 
whether they are hype or genuinely good ideas.” 


PRECISION MEDICINE 

One of the big concerns about developing 
drugs from phytocannabinoids is whether it 
is possible to isolate the desired effects. THC, 
for example, is a potent painkiller, but comes 
with unwanted psychoactive side effects such 
as difficulty with recall. Peter McCormick, a 
pharmacologist at the University of East Anglia 
in Norwich, UK, may have found a way around 
that. He and his colleagues found that in mice 
that lacked a particular serotonin receptor THC 
still had painkilling effects, but did not impair 
memory. 

THC targets the CB₁ receptor. And it turns 
out that the receptor missing in McCormick’s 
mice, 5-HT₂A, also interacts with CB₁. 
When the team used a small peptide molecule 
to impede the 5-HT₂A and CB₁ interaction 
in normal mice, the same effect was 
seen’. He suggests that this could be a way 
to target the effects of cannabinoids more 
precisely. “It seems we can separate out the 
‘good’ from the ‘bad’ effects of THC.” 

But even non-psychoactive phytocannabinoids 
that target CB₁ or CB₂ receptors can 
have unwanted side effects. “The endocan- 
nabinoid system is so ubiquitous, you can't 
target just one physiological process,” says 
Ware. “It’s hard to come up with clean drugs.” 

So researchers are looking for a way to 
modulate the effect of drugs that target this 
system. Endocannabinoid receptors have 
two binding sites: the main pocket and a 
secondary one — known as an allosteric site. 
A molecule that binds to an allosteric site 
changes the shape of the main receptor — 
altering the intensity of the effect of any mol- 
ecule bound there*. Allosteric compounds, 
whether found in the body, in cannabis or 
elsewhere, could be used to fine-tune the 
effects of phytocannabinoid drugs, reduc- 
ing unwanted side effects. Pertwee says that 
researchers have managed to identify a few 
allosteric endocannabinoids, but it is not 
known whether cannabis itself contains any. 
“We're now screening hundreds of com- 
pounds looking for good allosteric modula- 
tors,” says Pertwee. 

The other challenges facing cannabinoid 
research, however, are less tractable. Gov- 
ernment restrictions on cannabis keep the 
field small, and make it difficult to get high- 
quality research materials, says Marcu. GW, 
for example, is the only reliable source of 
cannabigerol (which has shown promise as 
an antidepressant’), creating a bottleneck for 
research, he says. 

And, despite the fact that researchers are 
working with purified phytocannabinoids, 
most of which have no psychoactive effects, 
they are still subject to the same restrictive 
regulations. This makes sending even small 
amounts of phytocannabinoids between labs 
in the United States all but impossible, says 
Marcu. And it means only a few pharmaceu- 
tical companies are willing to take the work 
forward into clinical trials. 

Research is becoming easier as cannabis- 
based drugs, many of which contain no THC, 
become more accepted in mainstream medi- 
cine. But progress is still not fast enough for 
Mechoulam. “We knew 30 years ago that 
CBD lowered seizures for epilepsy, and it’s 
the only thing that helps some kids,” he says. 
Now 84, he has little patience left for the legal 
obstacles. “It’s ridiculous,” he says. “We’re 
talking about the health of children here.” 


Brian Owens is a freelance science writer 
based in St. Stephen, New Brunswick. 

PERSPECTIVE 

Close the knowledge gap 

Nations with cannabis programmes should respond to a lack of 
research. Canada can be a leader, say Jonathan Page and Mark Ware. 

When it comes to medical cannabis, Canada is both a leader 
and a laggard. Policy-wise, Canada is ahead of many 
other countries, having had federal regulations that allow 
patients to access herbal cannabis (dried leaves and flowers) with 
a doctor’s authorization since 2001. Based on this early entry into 
medical cannabis, one would expect Canada to be at the forefront of 
research. Alas, this is not the case. As the number of patients accessing 
cannabis-based therapies has increased, research has not expanded. 
The opportunity to inform medical cannabis policy is slipping away. 

The Canadian medical cannabis system continues to grow and 
evolve. The government, through Health Canada, has created a system 
to license producers to grow and distribute quality-controlled 
cannabis. Under this system, the patient population reached almost 
24,000 in mid-2015, and around 4,000 doctors have prescribed cannabis. 
One would think that long-standing federal regulations and 
a large number of patients would mean that 
cannabis research is underway at many institutions 
in Canada. However, in the 14 years since 
the implementation of the first patient access 
programme, there have been only two federally 
funded clinical studies — a 2010 report that 
examined the use of smoked herbal cannabis 
to treat neuropathic pain¹ and a multicentre 
cohort study exploring one-year safety data². 
These studies were funded by Health Canada’s 
Medical Marihuana Research Program, which 
was scrapped in 2006 as part of federal budget 
cuts. To our knowledge, no university laboratory 
in Canada has been licensed to grow cannabis 
for research purposes. The regulations 
that give patients access make no specific 
allowances for research. 

Patients, doctors and producers have all 
expressed frustration with the regulations. Groups such as the Canadian 
Medical Association have made repeated calls for clinical trials 
and evidence-based treatment guidelines. Physicians bemoan the 
lack of clinical data and the fact that herbal cannabis is not an approved 
drug; some also harbour suspicions that patients are seeking medical 
cannabis merely as a front for recreational use. Some cities are seeing 
a growth in the number of unlicensed dispensaries. And despite years 
of regulated access to the dried plant, access to cannabis extracts has 
only recently been mandated through a Supreme Court decision. Such 
uncertainties are not conducive to a well-functioning national medical 
cannabis programme that supports research and education. 

Major gaps in knowledge persist. Fundamentally, the evidence base 
for the clinical use of herbal cannabis is thin. Although a recent systematic 
review³ found evidence for its use in treating chronic pain and 
spasticity, other claims were less well supported — creating a shaky 
foundation on which to base a treatment. Myriad other basic questions 
remain: for example, what are the pharmacological effects of diverse 
cannabis metabolites such as the non-psychoactive cannabidiol and 
volatile terpenoids? And we don't know whether individual strains of 
cannabis have different therapeutic properties⁴. 


Why has there been so little progress? The Canadian government 
has not set medical cannabis as a public health priority, and so has 
provided insufficient research funding. At the same time, it has missed 
the opportunity to establish a national drug safety programme around 
medical cannabis. With a few exceptions, such as a newly initiated clini- 
cal trial focused on osteoarthritis, the private sector has yet to pick up 
the slack — partly due to a lack of patentable intellectual property. 
Research conducted by pharmacologists, chemists and plant biologists 
is an important complement to clinical investigations. However, pre- 
clinical cannabis research in Canada is impaired by delays and difficul- 
ties in obtaining research licences. To clear this logjam, funding should 
be channelled through a peer-reviewed cannabis research programme 
that can issue fast-track approvals for research licences. These challenges 
are not unique to Canada. The same questions are being asked 
around the globe where national (such as Israel, the Netherlands and 
Uruguay) and regional (various US states) medi- 
cal cannabis-access regulations already exist or 
are being implemented. Ideally, these other gov- 
ernments will learn from Canada’s experiences. 

We call for a global initiative to identify and 
prioritize research needs around medical canna- 
bis, alongside support to implement the research. 
At the United Nations level, we need a policy 
change to allow biomedical researchers access to 
cannabis and related materials, with the expecta- 
tion that such liberalization will trickle down to 
national and regional programmes. To stimulate 
investment and enthusiasm for research, national 
medical cannabis offices need to be adequately 
resourced and given guidelines for streamlined 
and transparent review processes. Although pub- 
lic investment in cannabis research is important, 
harnessing funds from the burgeoning private 
sector that is profiting from the sale of herbal cannabis could support 
many high-quality projects. The US state of Colorado provides an 
excellent model for just such an approach. 

Promoting and facilitating research on cannabis is not an 
implicit acceptance of its medical value. Rather, it is a crucial 
response to an issue of global importance. The science of medical 
cannabis desperately needs to get out in front of the policy. It would 
be unforgivable if ten years from now we are still lamenting the 
lack of research despite widespread access to medical cannabis and 
a profit-hungry industry. 


Jonathan Page is a plant biologist at the University of British 
Columbia in Vancouver, Canada. Mark Ware is a pain physician at 
the Alan Edwards Pain Management Unit at McGill University Health 
Centre in Montreal, Canada. 

e-mails: jon.page@botany.ubc.ca; mark.ware@mcgill.ca 


1. Ware, M. A. et al. Can. Med. Assoc. J. 182, E694-E701 (2010). 
2. Ware, M. A. et al. J. Pain (in the press). 
3. Whiting, P. F. et al. J. Am. Med. Assoc. 313, 2456-2473 (2015). 
4. Russo, E. B. Br. J. Pharmacol. 163, 1344-1364 (2011). 


24 SEPTEMBER 2015 | VOL 525 | NATURE | S9 


© 2015 Macmillan Publishers Limited. All rights reserved 


| OUTLOOK | CANNABIS 


A potted history 


EARLIEST EVIDENCE 
~2700 BC 


Cannabis sativa is thought to have been grown for at least 
12,000 years, initially for fibre and grain. “The plant 
arose in Central Asia, but once people began growing 
it, it spread very quickly,” says Ethan Russo, a 
psychopharmacologist at biotechnology firm Phytecs, 
based in Los Angeles, California, and a historian of 
medical cannabis. The earliest use of cannabis as 
a medicine is attributed to the legendary Chinese 
Emperor Shen Nung (pictured), who is thought 
to have lived around 2700 BC. His teachings were 
passed down by word of mouth before appearing in 
writing in the Shen Nung Pen-tsao Ching, a second- 
century Chinese book of herbal remedies. 


Archaeologists excavating the Yanghai Tombs in northwest 
China in the early 2000s identified one grave as that of a shaman 
buried 2,700 years ago. In the grave was a stash of well-preserved 
C. sativa. Later analysis of the plant remains confirmed the 
presence of the psychoactive tetrahydrocannabinol (THC). “This is the 
oldest physical evidence of pharmacologically active cannabis,” says Russo. 




The first evidence of medical use of cannabis came from a fourth-century burial in a cave near Beit 
Shemesh, 30 kilometres west of Jerusalem. Archaeologists excavating the cave in 1989 found the skel- 
eton of a 14-year-old girl who had apparently died during childbirth. On her abdomen were burnt plant 
remains, which chemical analysis showed contained THC. The archaeologists concluded that cannabis 
had been burnt in a vessel and that the girl inhaled cannabis smoke during her efforts to deliver the baby. 


EUROPEAN INTEREST 




In Western Europe, remedies were based on hemp, which has more non-psychoactive, but biologi- 
cally useful, cannabidiol and less THC than Asian cannabis. Archaeological finds suggest that hemp 
was grown in Roman Britain for grain and fibre, but it was probably the later Saxons who used it as a 
medicine. The ninth-century medical text the Old English Herbarium advised pounded hemp for dressing 
wounds and a liquid concoction “for pain of the innards”. 




Modern medical interest in cannabis is traced to Irish physician William Brooke O'Shaughnessy. While 
in India, he saw how people used Indian hemp as a narcotic and medicine. Impressed, he tested it on 
animals before beginning trials in patients. O'Shaughnessy made extracts of cannabis resin and either 
rolled it into pills or dissolved it in alcohol to produce a tincture to treat conditions such as cholera, 
infantile convulsions and even tetanus. “O’Shaughnessy was of critical importance in introducing 
Indian hemp to British and North American physicians,” says Russo. 

SCYTHIA’S STONERS ~450 BC 


The Scythians of the Central Eurasian 
steppes erect a woollen tent, place 
a dish of red-hot stones inside and 
throw on hemp seeds, said Greek his- 
torian Herodotus. As the seeds begin 
to smoke, they inhale. 
“The Scythians 
enjoy it so much 
that they howl 
with pleasure,” 
he recorded. 


POT BOILER ~190 


The Chinese physician Hua T’o regu- 
larly anaesthetized his patients with a 
mixture of “hemp-boiling-compound” 
in wine before performing abdominal 
surgery. 


WESTWARD BOUND 1545 


Spanish colonists introduced can- 
nabis to Chile, initially growing it for 
fibre. In 1611, English settlers took 
hemp to Jamestown, Virginia. Hemp 
went on to become an important crop 
in North America. 


SNACK ATTACK 1563 


Portuguese physician Garcia da Orta, 
who lived in Goa, India, was the first 
Western observer to record the appe- 
tite-stimulating effects of cannabis: 
“Those of my servants who took it ... 
said that it made them so as not to feel 
work, to be very happy, and to have a 
craving for food.” 


GETTING HOOKED 1689 


The Royal Society’s Robert Hooke was 
given, by a friend, “a drug from India 
called Bangue”. Addressing the society, 
Hooke described it as 
“so like to hemp ... that 
it may be said to be 
only Indian hemp”. 
He added that it 
might “possibly 
be of consider- 
able use for 
Lunaticks, or 
for other Distem- 
pers of the Head 
and Stomach”. 
For thousands of years cannabis has been valued as a versatile herbal medicine. In the 
twentieth century, prescription gave way to proscription. Might this ancient remedy 
be about to regain its healing reputation? By Stephanie Pain 




WAR ON DRUGS 1863 


At the height of the American Civil 
War, a Union army soldier developed 
tetanus and gangrene after his shat- 
tered arm was amputated. He was 
treated with a tincture of cannabis and 
survived. Army doctors also prescribed 
cannabis with opium in an attempt to 
reduce the staggering death toll from 
diarrhoea and dysentery. 




VIPER’S DRAG 1923 


New Orleans was one of the first US 
cities to ban marijuana. Its popu- 
larity among the jazz musicians of 
Storyville, the city’s historic red-light 
district, fuelled a moral crusade by 
organizations who saw the place, the 
drug and the music as a menace to 
society. 


CALIFORNIA DREAMIN’ 1996 


California passed Proposition 215 
(the Compassionate Use Act), which 
allowed the sale and medical use of 
cannabis for patients with HIV/AIDS, 
cancer and other serious and painful 
diseases. 


HIGH HOPES 2011 


The first draft genome of C. sativa 
was published (see go.nature.com/glkffb), 
allowing researchers to 
explore its 100-plus cannabinoids 
and paving the way for the develop- 
ment of cannabis strains tailored to 
different medical uses. 


British chemists isolated cannabinol, the first cannabinoid identified, but their discovery 
came just as medical cannabis was falling out of favour. Advances in chemistry 
made it possible to isolate and synthesize the active ingredients of medicinal plants, 
and tinctures gave way to drugs of guaranteed consistency. The hypodermic syringe 
accelerated the move to water-soluble drugs that could be injected for faster pain relief. 


THE LAW STEPS IN 


An international treaty brokered by the League of Nations to control the opium trade was extended at the 
last minute to include cannabis. Signatories were required to control the trade in cannabis and prevent 
trafficking. The 1961 United Nations Single Convention on Narcotic Drugs clamped down still further, 
and a decade later the UN Convention on Psychotropic Substances made it all but impossible to carry out 
research on cannabis; only authorized people in supervised laboratories could work with it. 


In Israel, chemist Raphael Mechoulam isolated THC, kick-starting research into the plant's pharmacology 
(see page S12). “The main focus of research in the UK was in exploring the possible harmful effects 
of recreational cannabis,” says neuropharmacologist Roger Pertwee of the University of Aberdeen, UK. 
“But gradually people got interested in the potential medical use of synthetic cannabis-like chemicals.” 
Research led to nabilone and dronabinol, synthetic versions of THC, which were approved in the 1980s to 
suppress nausea during chemotherapy. 


THE CANNABINOIDS WITHIN 


Researchers discovered a new receptor in the brain, named CB1, through which THC exerts its 
psychoactive effects. “That led us to wonder if we have substances in our bodies that target this receptor,” 
says Pertwee. The first of these so-called endocannabinoids, anandamide, was found in 1992. More 
followed, along with a second receptor (CB2) in 1993. “The discovery that everyone has cannabinoids in 
their bodies led to a change in attitude,” says Pertwee. “It made our research much more respectable.” 


By the 1990s, growing numbers of people with conditions that failed to respond to prescription drugs were 
turning to cannabis. “We did a survey in the UK and the US, asking people with multiple sclerosis how they 
thought it helped them,” says Pertwee. Based partly on the findings, an inquiry by the UK House of Lords 
concluded in 1998 that there was strong evidence that cannabis had a medical value, and in 
2000 the government supported a trial of cannabis in multiple sclerosis. 


2005 


The first cannabis-based product Sativex (nabiximols) — a mouth spray of 
whole-cannabis extract, containing equal amounts of THC and cannabidiol 
— was given its first approval in Canada. The spray was developed by GW 
Pharmaceuticals, which was set up by Geoffrey Guy and Brian Whittle following 
the UK report. Today, nabiximols is approved in 27 countries to treat spasticity 
in patients with multiple sclerosis (see page S6). 
Cannabis plants grown in Colorado were used to treat Charlotte’s epilepsy, but researchers are heading to Israel to study the drug. 




Research without prejudice 


How one Mediterranean country is pushing the frontiers of medical cannabis knowledge. 


BY EMILY SOHN 


Alan Shackelford is intent on finding out 
why some of his patients respond so 
well to cannabis. But despite living in 
Colorado, the US state with some of the most 
liberal medical marijuana laws, he has had to 
travel to Israel to continue his research. 
Shackelford’s road to the Mediterranean 
nation started in 2012. While working in occu- 
pational medicine and injury rehabilitation 
private practice, he got a call from a mother 
whose 5-year-old daughter Charlotte was 
having 300 seizures a week and not respond- 
ing to treatment. The family were desperate 
for help. They had heard that medical mari- 
juana was being used to treat epilepsy, but had 
been turned away by doctors when they asked 
for the treatment for Charlotte. Although 
Shackelford had finally agreed to treat his older 
patients with cannabis a few years earlier, he 
was particularly reluctant to give the herb to 
such a young child. But, after digging into the 
literature, Shackelford agreed to treat Char- 
lotte with a specific strain high in cannabidiol 
(CBD), which a friend of the family converted 
into an oil extract. 
Now 8, Charlotte is thriving. She takes the 
oil every day and has just one seizure every 
month or so, Shackelford reports. He has seen 


other, similar stories, but such case reports and 
testimonials do not constitute peer-reviewed 
evidence. However, when he looked into getting 
permission for a trial, he was overwhelmed by 
the bureaucracy involved. At a federal level, cannabis 
is classified as a schedule 1 drug, meaning 
that it has no known medical value. Unless the 
study looks at the harm the drug might cause, 
permission for cannabis research can be harder 
to obtain than that for heroin or cocaine, says 
Shackelford. “There is a bias against doing trials 
here that might show a benefit.” 

Frustrated, he went to Israel — one of only 
a few countries with a national medical can- 
nabis research programme. Shackelford was 
attracted by the country’s 50-year history 
of study into potential uses for the drug, as 
well as a supportive regulatory atmosphere 
that is not found anywhere else. “The atti- 
tude towards research in Israel has always 
been different and not coloured by prejudice 
or propaganda,” Shackelford says. Reputable 
researchers who want to study cannabis are not 
simply dismissed, he says, “which is often the 
case in other countries, including, notoriously, 
the US”. 

Other researchers and entrepreneurs 
are, like Shackelford, turning to Israel to 
further research into cannabis. “It’s a topic in 
which, maybe surprisingly, Israel is pushing 
ahead,” says Raphael Mechoulam, a chemist at 
The Hebrew University of Jerusalem. 

NATURE.COM 
Read more about the research challenges at: 
go.nature.com/tg6d9v 


GOOD PARENTING 

The modern era of cannabis research started 
in Israel, spearheaded by Mechoulam (see 
page S10) — often called the father of medi- 
cal cannabis. In fact, Shackelford’s decision 
to treat Charlotte was influenced by Mech- 
oulam’s research into CBD (J. M. Cunha et al. 
Pharmacology 21, 175-185; 1980). 

Today, Israel is one of many places that 
boasts a broad supportive atmosphere for 
cannabis: some 75% of the population back 
its medicinal use, and in 2013 the Orthodox 
rabbi Efraim Zalmanovich ruled that medical 
cannabis was kosher. 

Israel also seems to nurture an entrepreneur- 
ial spirit, which is apparent in Mechoulam’s 
story. Originally from Bulgaria, Mechoulam 
began investigating cannabis in the 1960s 
while working at the Weizmann Institute of 
Science in Rehovot. He was attracted by the 
mystery: although the active constituents of 
coca leaves and opium were known, cannabis 
was still largely unstudied. 

Before he could start his research, 
Mechoulam needed to procure some cannabis. 


BRENNAN LINSLEY/AP/PRESS ASSOCIATION IMAGES 


ALAN SHACKELFORD 


Unsure how to get it, he asked his superior if 
he knew anyone in the police department 
who could give him some of their confiscated 
supplies. The administrator called the police, 
and Mechoulam recalls hearing someone on 
the other end of the line ask, “Is he reliable?” 
Once assured, the police invited him to come 
by. “I went there and picked up five kilos of 
hashish,” Mechoulam says. However, he soon 
found out that this was not the correct pro- 
cedure. “It turned out we had broken the law, 
and the police had broken the law. We should 
have gotten all kinds of permits.” But he was 
not punished. “I went and apologized,” Mech- 
oulam says. He filled in the proper paperwork: 
“And for years, I continued to get legal cannabis 
from the police.” 

With a reliable supply of cannabis, Mechou- 
lam and his team made rapid progress. In 1963, 
they determined the structure of CBD. The fol- 
lowing year, they isolated tetrahydrocannabi- 
nol (THC), the main psychoactive substance in 
cannabis. Mechoulam also helped to discover 
anandamide, a naturally occurring chemical 
in the brain that binds to the same receptor as 
THC, sparking interest in the ‘endocannabi- 
noid’ system (see page S6). 

Currently, says Mechoulam, there are about 
ten research groups in the field in Israel, work- 
ing on cannabis as a treatment for conditions 
such as post-traumatic stress disorder (PTSD), 
epilepsy, chronic pain, rheumatoid arthri- 
tis, fibromyalgia and Crohn’s disease (see 
page S15). At 84 years old, Mechoulam is still 
an active researcher and is currently planning 
a trial of cannabis in people with brain cancer. 
His adopted country has long celebrated 
him, and he is free of the 
reputation-threatening stigma that is often 
attached to cannabis researchers elsewhere. 

This is in contrast 
to the saga of US psy- 
chiatrist Sue Sisley, who 
experienced the negative side of cannabis 
research. Sisley is currently waiting to begin 
a phase II trial of cannabis for veterans with 
PTSD. The US Food and Drug Administra- 
tion (FDA) approved the study in April 2011, 
but getting the go-ahead to do clinical trials 
also required approval from the National 
Institute on Drug Abuse (NIDA), the Drug 
Enforcement Administration and the Pub- 
lic Health Service (this stage was dropped 
in June 2015 to reduce bureaucracy). NIDA 
and the Public Health Service rejected Sisley’s 
plan. Although she finally secured approval 
for a revised proposal in the spring of 2014, 
soon after that she lost her job at the Univer- 
sity of Arizona in Tucson. She recalls that, 
during multiple meetings and phone calls, 
administrators told her that they feared the 
university would lose its federal funding if it 
allowed cannabis to be studied on campus; 




the institution maintains that its decision not 
to renew her contract was unrelated to the 
politics of medical marijuana research. 

Sisley flew to Israel earlier this year to inves- 
tigate the potential of moving the research 
there, and describes the experience as eye-opening. 
“It was a joy to be there,” she says. 
“None of the researchers perceived marijuana 
as an impediment to science. They just viewed 
it as another study drug.” For the time being, 
Sisley has decided to remain in the United 
States as an independent researcher. Her study 
has received approval from an independent 
institutional review board and, with her team, 
she has secured funding and found space in 
which to conduct the study. She is now waiting 
for four cannabis strains from NIDA to finally 
start her trial. 


SMALL IS BEAUTIFUL 

Working in a relatively small country with a 
limited number of research facilities increases 
the chance that scientists will talk across dis- 
ciplines, particularly for researchers in a niche 
field such as cannabis. Immunologist Ruth 
Gallily recalls how her interest in the drug 
began after a chance encounter with pharma- 
cologist Esther Shohami some 15 years ago 
while they were both working at The Hebrew 
University of Jerusalem. Shohami described 
her experiments in which mice with head 
trauma showed great improvement when 
treated with a CBD-derivative (D. Panikashvili 
et al. Nature 413, 527-531; 2001). Gallily asked 
to see the animals’ brains and was surprised 
to see that those treated with the derivative 
showed suppression of a key inflammatory 
protein. This led her to study the anti-inflam- 
matory effects of CBD and its potential to treat 
conditions such as diabetes, heart disease and 
multiple sclerosis. “In a small place, you meet 
people again and again,” Gallily says. “You 
begin to get ideas from others.” 

Timna Naftali, a gastroenterologist at Meir 
Hospital in Kfar Saba, says that her experience 
of seeking approval for two trials of medical 
marijuana as a treatment for Crohn's disease 
was smooth — a process that could have been 
much more difficult had she been working in 
the United States or even Canada. “I was expecting 
an approach like that of a drug authority that 
would say, ‘no’ to everything because they are 
there to stop people from using drugs,” says 
Naftali. “But they were actually there to very 
sincerely try and look into the matter of when 
drugs are useful and when they are not.” 

Israeli technology companies are also 
involved in medical cannabis, developing 
delivery systems such as sustained-release pills 
and vaporizers. The country is even attract- 
ing foreign companies, who want to base their 
research and development (R&D) centres in 
Israel. Eyal Ballan is co-founder and chief 
scientist at drug-development company Can- 
nabics Pharmaceuticals, based in Bethesda, 
Maryland, the R&D arm of which is in Israel. 
Alan Shackelford (left) and Raphael Mechoulam. 


Ballan says that he has received calls from 
researchers in places such as Brazil, Germany 
and the United States. “Sometimes they want 
the product itself,” he says. “Sometimes they 
need help with regulations to promote medi- 
cal cannabis in their states. Sometimes they 
wish to test their own products in Israel.” This 
experience is echoed by Boaz Wachtel, co- 
founder of Phytotech Medical — an Austral- 
ian cannabis research company that also bases 
its R&D in Israel. Once every couple of weeks, 
he says, he hears from researchers and execu- 
tives looking for advice or seeking research 
collaborators. 

Israel’s approach to cannabis is more lib- 
eral than those of most countries, but it is far 
from a free-for-all. The drug remains illegal 
for recreational use (although there are signs 
that this may be changing). Israel also refuses 
to export cannabis to other countries, despite 
plenty of interest. Many researchers consider 
that this kind of balanced approach may be an 
important factor in why cannabis research in 
Israel is taken seriously; the herb is treated as 
a drug that needs to be studied in order to be 
safely used, just like any other. “We have to 
know exactly what amounts of CBD and THC 
people are getting,” says Mechoulam. “These 
things have to be well-regulated.” 

These policies have served the country 
well. “The Israeli national medical cannabis 
programme is considered a success,” says 
Wachtel. “I see a great movement, from the 
United States, from Canada, from many other 
countries, of people who want to get into this 
field. We are awash with newcomers.” 


Emily Sohn is a freelance journalist based in 
Minneapolis. 
PERSPECTIVE 


S14 | NATURE | VOL 525 | 24 SEPTEMBER 2015 


Be clear about the real risks 


The assertion that cannabis use can cause schizophrenia is not borne out 
by the evidence, says Matthew Hill. 


The 1936 film Reefer Madness depicted cannabis as a drug that provoked 
uncontrollable insanity, leading to manslaughter, suicide 
and attempted rape. This was a ridiculous characterization of the 
effects of cannabis, but there is a long history of associating the drug with 
psychotic disorders. In research terms, the first evidence came from a 
1987 study, which found that Swedish conscripts had an increased risk 
of developing schizophrenia if they had consumed cannabis more than 
50 times in their life. This finding has been replicated, implying, at the 
very least, an intricate relationship between cannabis use and schizophrenia. 
The nature of this relationship is still a matter of debate and is 
not as clear as some researchers or policymakers would suggest. 

One interpretation is that cannabis is an instigating factor in the development 
of schizophrenia. Some researchers have argued that removing 
cannabis, particularly high potency strains, from society would reduce 
the prevalence of the disease. Although this may seem alluring from 
a drug regulatory standpoint, from a scientific 
one we need to look at the evidence. The history of 
cannabis use in the Western world stands as an 
enlightening social experiment. Before the 1960s, 
cannabis use in Europe and North America was 
relatively uncommon; today, use varies between 
countries, but in certain regions upwards of 20% 
of the adolescent population use the drug. If cannabis 
is causally related to the development of 
schizophrenia, then it would be expected that the 
incidence of the disease would have increased 
significantly with increased use of the drug. Yet 
schizophrenia rates since the 1960s have remained 
stable worldwide, and in fact even declined slightly 
in the West between the mid-1960s and the mid-1990s. 
Although changes in diagnostic practices 
and disease classification may have contributed to 
this drop, if cannabis does induce schizophrenia, 
then we would expect to have seen a jump in the number of cases. There 
also seems to be no difference in schizophrenia rates between countries 
where cannabis use is prevalent and those where its use is rare. Again, 
if the drug were an instigating factor alone, we would expect to see differences 
in the population data. 

Other evidence purporting to support the causal hypothesis is also 
inconclusive. Clinical studies have shown that pure tetrahydrocannabinol 
(the psychoactive constituent of cannabis) can produce an acute 
psychotic state. But these states are transient and do not lead to mental 
illness. It is also known that people with schizophrenia consume cannabis 
more than the general population. Although cannabis may worsen 
schizophrenic symptoms such as delusions and hallucinations, it might 
also mitigate negative symptoms, such as anxiety and social withdrawal 
— explaining why people with schizophrenia would want to use it. As 
with any correlation, there is the possibility that a third variable mediates 
the relationship. The finding that genetic variance could predispose a 
person to schizophrenia and also increase risk of cannabis use could 
explain the co-occurrence of these variables on a biological basis. 

What does seem to be clear is that heavier than average cannabis 
use, particularly in early adolescence, can accelerate the onset of 
schizophrenia. Although this seems damning at first glance, it is less so 
when examined more closely. A reasonable interpretation is that indi- 
viduals with schizophrenia, in their attempt to self-medicate, tend to use 
the drug more frequently than the general population. A vicious circle 
could develop whereby an adolescent in the early phases of psycho- 
sis begins to use cannabis to mitigate some aspects of their developing 
symptoms, but in fact speeds up disease onset. In this sense, cannabis 
would be one influence in an already developing illness as opposed to a 
stimulus that induces the development of the disease itself. 

In support of the hypothesis that cannabis only triggers the onset 
of schizophrenia, gene variants have been identified that predict the 
development of schizophrenia in response to cannabis use. This sug- 
gests that cannabis promotes the development of schizophrenia only 
in people with a specific biological predisposition. Imagine that the 
disease is like a campfire: adding fuel to a pile of sticks has little effect, 
but throwing fuel on a weakly burning fire will 
increase its strength. Regardless of whether fuel 
is added, the embers will continue to burn. This 
hypothesis would also explain the epidemio- 
logical data: that higher rates of cannabis use are 
associated with schizophrenia, but cannabis use 
does not affect disease rate at a population level. 
Importantly, this would mean that cannabis does 
not induce schizophrenia in non-vulnerable indi- 
viduals. The distinction between the causal and 
the trigger hypothesis is significant in the mes- 
sage that is conveyed to the public. The former 
suggests that cannabis use alone can cause the 
disease, whereas the latter indicates that canna- 
bis is merely a risk factor for someone who would 
probably develop schizophrenia anyway. 

Clearly, understanding the nature of the risk 
of schizophrenia is important when developing 
social policies surrounding cannabis. Education about the drug's effects 
on mental health should highlight the association of cannabis use with 
schizophrenia. But scientists should be careful with the language that 
they use, particularly when presenting this relationship to the public. It 
is important to ensure we do not confuse correlation with causation and 
incite another Reefer Madness-style panic. By offering careful, evidence- 
based interpretations of the data, scientists can effectively contribute to 
policy decisions related to cannabis use and mental health. 


Matthew Hill is a cannabinoid neuropharmacologist in The 
Hotchkiss Brain Institute, University of Calgary, Alberta, Canada. 
e-mail: mnhill@ucalgary.ca 
Proponents’ claims about the medical benefits of marijuana are outpacing the evidence. 


MEDICAL MARIJUANA 


Showdown at the 
cannabis corral 


Researchers are gathering clinical data for medical marijuana 
against a backdrop of deregulation and opportunism. 


BY MICHAEL EISENSTEIN 


Mary Jane Rathbun was beloved among 
patients with AIDS at San Francisco General 
Hospital in the late 1980s and early 1990s. As 
a volunteer, she took them for X-rays, filed 
their prescriptions — and supplied them with 
marijuana-laced brownies to alleviate their 
debilitating pain and wasting symptoms. 
That service resulted in ‘Brownie Mary’ 
being arrested on multiple occasions. But 
patients got relief from her deliveries, and 
doctors both at her hospital and elsewhere 
were taking note. Barth Wilsey, then a pain 
research fellow at the University of California, 
San Francisco, also heard patients claiming 
they benefitted from this illicit substance. “I 
had a number of patients going to a dispen- 
sary in Oakland who told me that they were 
getting more relief from marijuana than from 
the medicine I gave them,” he says. Inspired by 
Brownie Mary, Donald Abrams, an oncologist 
at the same hospital, tried mounting a clini- 
cal trial for medical marijuana as a treatment 
for patients with AIDS, but his efforts yielded 


years of frustration. He recalls a 1997 conver- 
sation with Alan Leshner, then head of US 
drug research agency NIDA. “He told me that 
they’re the National Institute ‘on’ Drug Abuse, 
not ‘for’ Drug Abuse,” says Abrams. 

Much has changed since then. Abrams, 
Wilsey and others have steadily accumulated 
data that suggest that medical marijuana has a 
clinical benefit for treating chronic pain, and 
the barriers that previously thwarted research 
are eroding. In June, the US government lifted 
a major hurdle in the grant-review process for 
cannabis research by removing the require- 
ment for Public Health Service (PHS) review, 
and NIDA is beginning to offer researchers a 
broader range of cannabis strains. Meanwhile, 
several countries and nearly half of all US states 
authorize medicinal marijuana use. These 
changes create opportunities for researchers 
to embark on more robust trials and to directly 
observe whether the herb helps patients. 

However, medical marijuana may also be a 
victim of its own success. Marijuana laws make 
assumptions about medical benefits that out- 
pace the evidence, and in some cases create 
markets for retailers to peddle unproven medicines 
— touting, for example, cannabis-derived 
oils to ‘cure your own cancer’. Arno Hazekamp, 
who heads research and education at Bedrocan 
in Veendam, the Netherlands — the Dutch gov- 
ernment’s official provider of medical cannabis 
for more than a decade — dubs such opportun- 
ists “cannabis cowboys”. “Everybody is selling 
stuff, but the real professionals who are sup- 
posed to do the job are not there yet,” he says. 
As a result, some of cannabis’s scientific sup- 
porters fear that poorly planned deregulation 
could undermine efforts to establish medical 
legitimacy for this controversial crop. 


THE PROMISE OF POT 

Much of the seminal clinical research into 
medical marijuana (that is, in its herbal 
form) was performed under the auspices of 
the Center for Medical Cannabis Research 
(CMCR), funded by the state of California 
between 2000 and 2003. With CMCR sup- 
port, Abrams and colleagues conducted a 
randomized trial showing that 52% of canna- 
bis users reported a meaningful reduction in 
HIV-associated neuropathic pain compared 
with 24% of the control group. With the help 
of a funding initiative from Health Canada, 
pain specialist Mark Ware and his team at 
McGill University in Montreal reported simi- 
lar findings in patients with neuropathic pain 
as a result of a range of conditions. 

Early data for several conditions are 
intriguing, but limited. After learning that 
her patients with Crohn's disease were find- 
ing relief with cannabis, gastroenterologist 
Timna Naftali at Israel’s Meir Medical Center 
in Kfar Saba and her team investigated whether 
medical marijuana can induce remission of 
the disease. They found that it significantly 
improved appetite and pain symptoms, and 
allowed some patients to end their depend- 
ency on steroids. However, cannabis had no 
clear effect on the inflammatory processes that 
underlie Crohn’s disease, and did not result 
in remission. “The question is whether they 
were feeling better because it is a painkiller or 
reduces stress, or whether it really does some- 
thing to inflammation,” says Naftali. 

Other findings are more tentative or anec- 
dotal. Parents of children with severe epilepsy 
have reported dramatic improvements from 
cannabis oils derived from strains rich in cannabidiol 
(CBD) — one of the plant’s non-psychoactive 
chemicals. Studies of marijuana for 
the treatment of mental health conditions such 
as post-traumatic stress disorder (PTSD) have 
yielded ambiguous results: some suggesting 
that individuals with PTSD may self-medicate 
to achieve restful sleep — whereas other data 
suggest that marijuana can actually disrupt 
healthy sleep patterns. “The PTSD literature is 
a mess,” says Mitchell Earleywine, a psychologist 
at the University at Albany, State University 
of New York. He queries whether the different 
results arise from patients using different 
Medical marijuana is sold in dispensaries, but some people fear that it will be used recreationally. 


marijuana strains, but is also concerned that 
the herb may offer limited long-term value to 
these patients by acting as a “Band-Aid” for a 
condition that could benefit more from direct 
psychiatric intervention. “Personally, I'd rather 
see these folks do behavioural treatments.” 


QUALITY CONTROL 
The empirical basis for medical marijuana 
is thin. A systematic review of clinical stud- 
ies published in June found “low-quality 
evidence” for the use of cannabis in almost 
all conditions, with the exception of chronic 
pain and muscle spasticity in multiple sclerosis. 
The trials were often confounded by poor 
design or execution, or failed to objectively 
demonstrate clinical benefit. Most studies 
examined isolated cannabinoids, either natural 
or synthetic; data from trials in which the plant 
is smoked or vaporized were especially limited. 

Even the best trials were narrow in dura- 
tion and scope. “These studies do not extend 
beyond more than two to four weeks, and are 
limited in the range of doses that are avail- 
able in terms of THC [tetrahydrocannabinol] 
potency,” says Ware. Finding a placebo for a 
trial of a drug with well-known psychoac- 
tive properties can be problematic. Margaret 
Haney, a neurobiologist at Columbia Univer- 
sity in New York, notes that there is a consider- 
able expectancy effect — preconceptions about 
cannabis skew how users respond. “It gets you 
high, and if you believe it’s going to cure eve- 
rything under the sun, then when you smoke 
it you feel like it cures everything under the 
sun,” she says. And for conditions such as pain 
or appetite control that rely on user reports, 
expectations will have a big impact. Possible 
solutions include cannabis strains that are 
devoid of psychoactive THC, or placebos laced 
with mild sedatives. 

The barriers facing even well-designed trials 


can be exceedingly high. In the United States, 
cannabis is a schedule 1 controlled substance 
(it has no currently accepted medical use). 
Studying the drug requires researchers to grap- 
ple with an alphabet soup of agencies, includ- 
ing NIDA, the Food and Drug Administration 
(FDA), the Drug Enforcement Administration (DEA) 
and until recently the PHS. The cost of large- 
scale, long-term human studies can also be 
prohibitive. These bureaucratic and financial 
barriers have led to something of a stalemate, 
where researchers have no robust clinical data, 
but also lack the wherewithal to produce any. 
“If we just had a large funding source for even 
one good randomized controlled trial on a disorder 
that people care about, that would do the 
trick,” says Earleywine. 

US researchers are now seeing some hope. 
The lifting of the PHS review requirement 
should accelerate grant review. “The first time 
I studied marijuana, that stage took 18 months, 
and I couldn't go to the DEA or FDA until 
they concluded their review,” says Wilsey. 
And NIDA’s expansion of the range of strains
available through its one authorized growing 
facility (at the University of Mississippi) means 
that researchers will have access to plants with 
THC levels that more closely resemble those 
found in street or dispensary cannabis as well 
as strains with higher levels of CBD. 

However, the rapid policy changes now 
underway in North America will not necessar- 
ily help efforts to evaluate cannabis’s medicinal 
value. Canada first established a legal medical 
cannabis framework in 2001, but the initial 
system proved impractical. “Health Canada 
put this very complicated programme into 
place, and only a few hundred people went 
through the process and got approved,” says 
Benedikt Fischer, a public health researcher at 
the Centre for Addiction and Mental Health 
in Toronto. A series of court decisions led to 
the launch of a reformed programme in April
2014; now the government oversees cannabis 
production but not prescription access, which 
is at the discretion of individual doctors. “It’s 
completely wide open; in many ways, arbitrary,”
says Fischer.

If Canada’s system is arbitrary, the situa- 
tion in the United States could be described 
as bewildering. At one extreme are states 
such as California, where patients can obtain 
almost any strain of cannabis by finding a 
doctor willing to provide a recommenda- 
tion — merely a formality in some locales. By 
contrast, Minnesota offers restricted access 
to limited quantities of cannabis extract in 
liquid or pill form through state-run dispen- 
saries for a narrow range of disorders. So far, 
23 states and the District of Columbia have 
each come up with distinct approaches for 
providing medical cannabis — all in open 
defiance of federal law (see page S2). “There is 
no standardization,” says J. Michael Bostwick, 
a psychiatrist at the Mayo Clinic in Roches- 
ter, Minnesota. “I guess we could have ‘best 
practices’ for states defying federal law, but 
that seems like adding convoluted craziness 
to what is already a crazy situation.”

North of the border, the Canadian Medical 
Association and other professional organiza- 
tions oppose the medical marijuana policy 
because they say it is based on insufficient 
evidence. Pharmacologist Harold Kalant of 
the University of Toronto notes that many 
doctors resent being appointed gatekeepers 
for this unproven drug — but even as a critic 
of the Canadian system, he believes the medi- 
cal community has a responsibility to engage 
with this issue. “Who is better equipped to do 
this?” asks Kalant. “But this means they have 
to buckle down and do some reading and make 
themselves knowledgeable.” And that includes 
keeping abreast of the latest thinking about 
potential risks (see ‘Calculating the costs’). 


The problems associated with medical marijuana law
are in part attributable to the fact that this
decision, which requires sophisticated scientific
understanding, has been given to the public.
“If the majority of voters are in favour of overall
legalization, then let’s just make it consistent
across states,” says Haney, “but don’t have people
vote on whether it’s a medicine.” Among
jurisdictions that limit medical marijuana to 
specific conditions, there is little consensus 
on the best approach, whereas those that have 
adopted more open-ended policies have created
a milieu of easily obtained permits and
abundant dispensaries that are largely indis- 
tinguishable from outright legalization. “Peo- 
ple are saying that medical cannabis is just a 
scam and a cover for recreational use,” says 
Ware. “I worry that the message that there
may be medical value gets lost.”

This also gives free rein for cannabis cow- 
boys to make outlandish claims. Abrams is 
no longer able to keep up with the e-mails 
and calls about ‘miracle cancer cures’. “I’m a
little frustrated,” he says. “There’s just no data 
to support this.” Other companies offer prod- 
ucts that piggyback on promising preliminary 
research — but with little clinical proof, regu- 
latory oversight or quality control. Hazekamp 
cites a recent FDA investigation of extracts 
from CBD-rich cannabis strains, which are 
being touted as a potential treatment for severe 
epilepsy, cancer and diabetes. “Almost none of 
the products complied with their labelling,” he 
says, “and some contained no CBD at all.”


HIGH TIME FOR RESPONSIBLE RESEARCH 

These legal experiments could yet bear scien- 
tific fruit. Colorado is using money raised from 
its recreational marijuana industry to support 
nine clinical research grants, several of which 
are for observational studies of real-world 
patient use. In Canada, Ware helped establish 
the Quebec Cannabis Registry earlier this year, 
which will track patient health and outcomes 
over four years. Fischer and colleagues are 
developing a similar programme to monitor 
patients in Ontario. 

There is a risk, however, that these efforts 
could be confounded by recreational users 
gaming the system. Hazekamp notes that in 
the Netherlands, where recreational cannabis 
is widely available, the proportion of people 
accessing medical marijuana is less than one- 
tenth that of some US states°. Canada’s medi- 
cal use is also booming, and Fischer likens 
the situation to Prohibition in the 1920s and 
early 1930s, when alcohol consumption in the 
United States was restricted to religious and 
medicinal uses. “Suddenly thousands of peo- 
ple needed medical alcohol — and it’s not like 
there was a sudden epidemic of ill people,” he 
says. Observational data are also no substitute 
for randomized controlled trials — and ironi- 
cally, medical marijuana access may impede 
recruitment for the trials needed to prove its 
value. “Those who are very ill can get cannabis 
anyway, and don’t need a study,” says Naftali.

Veteran cannabis researchers are finding 
new opportunities to strengthen their case. 
Wilsey’s team is embarking on a neuropathic 
back-pain trial spanning 8 weeks, with more 
than 100 patients, while Abrams is working 
with NIDA’s new cannabis strains to assess the
combined effects of THC and CBD in modu- 
lating pain in sickle-cell anaemia — a condi- 
tion associated with especially debilitating and 
difficult-to-manage pain. Naftali is conduct- 
ing a 50-patient Crohn's disease trial that will 
closely examine physiological markers as well 
as patient-reported symptoms. “We'll be doing 
endoscopy before and after patients take can- 
nabis, to see whether there’s any real difference 
in the inflammation,” she says.

Funding opportunities are also attracting
a new generation of researchers. Colorado’s
research grant programme supports a
placebo-controlled trial for PTSD in 76 veterans and
a pain study that pits cannabis against oxycodone.
Cannabis growers are also keen to build
their scientific case: Canadian company
National Green Biomed, based in Vancouver,
gave a Can$1 million (US$760,000) grant to
researchers at the University of British Columbia
to explore whether medical marijuana
suppresses HIV infection. And Bedrocan has
hired its first clinical trial coordinator, and is
investigating the chemical composition of its
plants in order to learn why patients might
claim different benefits from different strains.

The future of medical marijuana remains
unclear: the money now pouring into this
new industry could accelerate deregulation, or
a backlash may lead to stricter controls. From
Ware’s perspective, the current situation may
offer a fleeting chance for the research community
to seize the reins (see page S9). “There’s an
opportunity for us to take a global leadership
position; the world is crying out for some
direction in managing not just medical, but also
recreational cannabis,” he says. “This should
stop being an issue for legal minds to wrestle
over; it should be the scientists that are putting
the evidence together that drive the policy.”

Michael Eisenstein is a freelance science
writer based in Philadelphia, Pennsylvania.


CALCULATING THE COSTS

How risky is weed?

One of the big fears of creeping legalization
of marijuana is that rates of use and
addiction will soar; it is estimated that
nearly 9% of users become dependent.
“It’s not the worst drug-use disorder, but
it’s also not a good thing, and the relapse
rates are very high,” says Margaret Haney,
a neurobiologist at Columbia University in
New York. She notes that dependence is
particularly problematic among younger
users, who are less motivated to quit than
older people with families and careers.

Some evidence links heavy adolescent
use to lasting impairments in cognitive
development. Pharmacologist Harold Kalant
at the University of Toronto, Canada, cites
a 38-year-long study from New Zealand,
showing that adolescents who use the drug
experience potentially lifelong deficits in
cognitive skills, including judgement and
working memory. “These kids have much
poorer school records, higher dropout rates
and poorer employment prospects,” says
Kalant. This was the first of only a few studies
to investigate this relationship over time,
so the results are open to interpretation.
“It’s biologically plausible,” says Haney.
“But there are no data that definitively
demonstrate that marijuana is causing this
effect.” Indeed, a study published in August
found no difference in mental and physical
health among young adults who had used
cannabis as adolescents. However, the
study was smaller and shorter than the New
Zealand one and, although Haney says it was
well conducted, she adds that its size makes
it hard to identify real differences between
users and non-users. “Those arguing
marijuana might cause psychosis would
probably not be dissuaded by such small
numbers,” she says.

One recent analysis showed that
although adolescent use is higher in states
that allow medical marijuana, passage of
those laws did not lead to increased use,
suggesting that pre-existing local attitudes
may be the primary factor. Benedikt
Fischer, a public health researcher at the
Centre for Addiction and Mental Health in
Toronto, sees the fear that cannabis is a
danger to young people as an exaggerated
product of ‘Reefer Madness’-style moral
panic. “There are a lot of hockey-related
brain injuries,” he says, “but nobody in
this country would think about prohibiting
hockey for young people.” He believes that
other risks, such as increased frequency of
cannabis-related traffic accidents, may be
under-appreciated.

Other critics see a double standard
for cannabis, given the extensive misuse
of pharmaceuticals such as opioid
painkillers. “There are risks, but those can
be modulated by careful patient screening
and public health messaging,” says pain
specialist Mark Ware, at McGill University
in Montreal, Canada, “and they should not
stop progress in considering these drugs for
middle-aged women with multiple sclerosis,
or for men with HIV/AIDS.” M.E.
CANNABIS

4 BIG QUESTIONS

As restrictions around cannabis research ease, scientists are exploring
how the plant could be medically useful. Here are four of the hardest
questions they face.

BY JULIE GOULD


How many types of cannabis are there?

Cannabis seems to come in many forms, partly because of millennia of
breeding by humans. Producers have allocated a ‘folk taxonomy’ to strains
with various medicinal and chemical properties, but the evolutionary
relationship between species is unclear (see page S4).

Researchers need to characterize plants for drug development, including
by species or subspecies. Without an accurate taxonomy, there is no
comparative base to work from and some varieties might become extinct.

MAJOR HURDLES
Because cannabis is wind pollinated with separate male and female plants,
controlling breeding can be difficult. There is no organized seed bank and
legal restrictions impede the collection of samples. Little government
funding means there are few research labs.


What are the medically useful compounds in cannabis and what diseases can they treat?

Of the 100 or so cannabinoids, tetrahydrocannabinol (THC) and cannabidiol
have been the focus of most research. Compounds such as the volatile
terpenes also seem to have intriguing properties. All constituents might
act alone or in combination.

Those who use medical marijuana claim that the herb can treat a range of
ailments, but evidence is thin. To test these claims and create a
pharmaceutical product, the active compounds need to be isolated and
investigated.

MAJOR HURDLES
Research has been undermined by a lingering assumption that all
cannabinoids share the properties of THC, and regulators apply the same
restrictions to all of them (see page S6).


Can the endocannabinoid system be targeted by compounds other than those found in cannabis?

So far, drugs that target the endocannabinoid system have failed.
Cannabidiol has an anti-schizophrenia effect by increasing levels of the
endocannabinoid anandamide, a type of neurotransmitter, suggesting that
anandamide is the active molecule.

Therapies based on endogenous molecules are a mainstay of medicine. The
hormone insulin was successfully given to patients with diabetes shortly
after its discovery. And cortisone-based medicines can be used to relieve
pain and reduce swelling.

MAJOR HURDLES
Endocannabinoids have localized actions and short half-lives, so their use
as systemic therapies is challenging. Endocannabinoid receptors are found
throughout the body, making it difficult to target specific physiological
processes without side effects.


What are the best ways to deliver cannabis-based medicines?

Delivery through the lung by smoking or vaporizing facilitates fast
absorption. Administration that involves compounds passing through the
liver is tricky because this organ destroys cannabinoids. Only an oral,
under-tongue spray has so far been approved.

Route of delivery determines how much of a compound is absorbed by
different parts of the body, and how quickly. The faster that
THC-containing medicines are delivered to the brain, the more potential
there is for psychoactive side effects, as well as for abuse.

MAJOR HURDLES
Regulators require that cannabis-based drugs have no psychoactive effects,
so delivery of THC to the brain needs to be controlled. Little is known
about the metabolism of other cannabinoids and hence appropriate delivery
routes.


Julie Gould is editor of NatureJobs in London.
PHARMACOLOGY


NEWS & VIEWS 


Cannabis in neurology—a potted review


Richard Hosking and John Zajicek 


Discovery of the endogenous cannabinoid signalling system unleashed 
substantial new research into several neurological conditions. A recent 
systematic review suggests that medical marijuana can improve a 
number of symptoms—particularly spasticity—in multiple sclerosis, 
but cannabinoids can have adverse psychological effects and their 
comparative effectiveness is unknown. 


Hosking, R. & Zajicek, J. Nat. Rev. Neurol. 10, 429-430 (2014); published online 8 July 2014; 


doi:10.1038/nrneurol.2014.122 


Cannabis attracts attention. The war on 
drugs and controversy over legalization 
contrast with renewed interest in alternative
therapies and anecdotal reports of
patient benefit. A growing understanding 
of cannabis pharmacology has revealed 
important molecular mechanisms that 
relate to neural communication and inflam- 
mation, and several licensed preparations of 
medical cannabis are now available (Box 1). 
A recent systematic review published in 
Neurology by Barbara Koppel and colleagues 
provides timely guidelines for the use of 
medical marijuana in selected neurological 
disorders.¹ The evidence summarized in
the review by Koppel et al. supports a role 
for cannabis in the symptomatic treatment 
of multiple sclerosis (MS), but its efficacy 
in Huntington disease (HD), Parkinson 
disease (PD) and epilepsy is unknown. 


“...cannabinoids should be
investigated ... using the same
evidence-based criteria as for all
other drugs”


Cannabis is thought to derive its name 
from ancient Sanskrit and Hebrew texts, 
where it means ‘fragrant cane’, and the 
plant has been of medicinal interest for 
millennia. The first clinical study involving
cannabis was conducted in 19th-century
Calcutta by Irish physician Sir William 
O’Shaughnessy, who introduced Indian 
hemp to Victorian medicine.² However,
it was not until the 1960s that Raphael
Mechoulam and colleagues in Israel identi- 
fied the major psychoactive cannabinoid 
delta-9-tetrahydrocannabinol (Δ⁹-THC),
which led to the discovery of an extensive
endogenous lipid signalling system.³

There are currently over 100 known can- 
nabinoids that have diverse effects at both 
cannabinoid and non-cannabinoid recep- 
tors. Cannabinoid receptor 1 (CB1) is the 
most common G-protein-coupled receptor 
in the CNS. High densities of CB1 within the 
cerebellum, basal ganglia, hippocampus and 
cerebral cortex correlate with the capacity 
of cannabis to produce motor and cognitive 
impairment. This psychoactivity—a prop- 
erty that not all cannabinoids possess—is 
largely mediated by Δ⁹-THC. Cannabinoid
receptor 2 (CB2) mRNA is found within 
cells of the immune system including 
lymphocytes, monocytes and microglia, 
although current problems with antibody 
specificity complicate receptor localiza- 
tion. The major ligands for these recep- 
tors are endogenous cannabinoids, such as 
anandamide and 2-arachidonoylglycerol, 
which are derived from arachidonic acid 
and demonstrate marked overlap with 
prostaglandin signalling.⁴ The actions of
these ‘endocannabinoids’ are complex, as 
are those of plant-derived cannabinoids, 
which, in addition to receptor stimulation, 
might have antioxidant properties. 

Cannabinoid receptors are present in 
all major pain pathways, and the analge- 
sic effects of cannabinoids are likely to 
be mediated by postsynaptic retrograde 
inhibition of neurotransmission via CB1.
An alternative mechanism might involve 
CB2 activation on microglia and peripheral 
inflammatory cells.* Endocannabinoids are 
hydrolysed by several enzymes, including 
fatty acid amide hydrolase (FAAH), mono- 
acylglycerol lipase (MAGL) and diacyl- 
glycerol lipase (DAGL). Most information 
on these mechanisms derives from animal 
studies, but while animal models have 
provided many of the important tools for 
dissecting complex immunological pro- 
cesses, they rarely replicate the pathological 
features of human disease.° 

Koppel and colleagues reviewed the 
full text of 64 papers from a total of 1,730 
abstracts published between January 1948 
and November 2013.¹ This search yielded
34 studies that met the authors’ inclusion 
criteria. Eight of the included studies were 
rated as Class I evidence according to the 
American Academy of Neurology classification
for therapeutic articles. The authors
addressed several questions, including the
efficacy of cannabinoids for treating MS
symptoms of spasticity, central pain, painful
spasms, bladder dysfunction and involuntary
movements including tremor. They
also assessed the use of cannabinoids in HD,
levodopa-induced dyskinesias in PD, cervical
dystonia, the tics of Tourette syndrome,
and seizure frequency in epilepsy.


Box 1 | Cannabinoids

Cannabis sativa
The botanical name for the hemp plant from
which cannabis is obtained

Marijuana
A synonym for cannabis that also refers to
the dried flowers and leaves

Cannabinoids
Plant-derived compounds and endogenous
or synthetic analogues

Delta-9-tetrahydrocannabinol (Δ⁹-THC)
The main plant-derived psychoactive
cannabinoid; the relative amount of Δ⁹-THC
in a preparation determines the extent of
psychological effects

Cannabidiol (CBD)
The main non-psychoactive plant-derived
cannabinoid

Endocannabinoids
Endogenous lipid signalling molecules that
are widespread throughout the body

Oral cannabis extract
Extracts such as Cannador, a capsule with a
defined THC:CBD ratio (2.50:1.25 mg)

Dronabinol (Marinol)
Oral synthetic Δ⁹-THC

Nabilone (Cesamet)
Oral synthetic cannabinoid similar to Δ⁹-THC

Nabiximols (Sativex)
Herbal oromucosal spray with a defined
THC:CBD ratio (2.7:2.5 mg)

The main finding of Koppel and co- 
workers was that oral cannabis extract (see 
Box 1) is effective in treating spasticity, central 
pain and painful spasms in patients with MS. 
Nabiximols and THC are also likely to be 
effective for these indications. Nabiximols 
is probably effective in reducing urinary 
frequency, but no formulation was felt to 
improve MS-related tremor. Oral cannabis 
extract was found to be probably ineffective 
in treating levodopa-induced dyskinesias in 
PD; no discernible effect was found for any 
cannabinoid on the remaining neurological 
disorders. Koppel et al. point out that the risk 
of serious adverse psychological effects was 
nearly 1%. The comparative effectiveness of 
medical marijuana relative to other therapies 
for these indications is unknown. 

We welcome evidence-based guidelines 
for cannabinoid use in neurological disease, 
and Koppel et al. have provided an excellent 
systematic review of current clinical research. 
The limitations of the present study reflect 
those of the original studies. These include 
problems with subjective patient-centred 
rating scales; objective clinical scales with 
potential interobserver variability; the risk 
of unblinding owing to the psychoactive 
properties of the drugs; and often-substantial 
placebo effects that make treatment effects 
difficult to establish. 

An additional issue, not specific to canna- 
bis preparations, is the difficulty of identi- 
fying responders within heterogeneous 
populations, which has led to novel trial 
designs and will continue to tax medical 
ingenuity. Importantly, the studies included 
in the Koppel et al. review used a wide range 
of formulations with different cannabinoid 
content, methods of administration and 
dose. Cannabinoids are intensely lipophilic 
and readily cross the blood-brain barrier, 
but the absorption and bioavailability of 
preparations are highly variable, partly due 
to first-pass metabolism (metabolization 
before the drug reaches the systemic circu- 
lation), which oromucosal administration 
attempts to reduce. 


Titration speed can affect the develop- 
ment of unwanted psychoactive effects, 
but the current available evidence suggests 
that these effects resolve on dose reduc- 
tion or drug cessation. Smoked marijuana 
causes the most rapid rise in plasma THC 
concentration, which then quickly falls as 
a result of tissue distribution.* However, 
combustion alters cannabinoid activity, 
and the attendant risks of inhaled carcino- 
gens are an additional cause for concern. 
Furthermore, a recent study shows that 
patients with MS who smoke cannabis have 
greater cognitive impairment than those 
who do not use this drug.° 

Beyond symptom amelioration, the 
potential of cannabinoids as neuroprotec- 
tive agents has created a great deal of 
interest. Although the overall results of a 
recent dronabinol study of neuroprotec- 
tion in progressive MS were negative, sub- 
group analysis did suggest a possible early 
treatment effect in less-disabled patients.’ 
Polymorphisms in the gene encoding CB1 
could influence the inflammatory neuro- 
degenerative process in MS,* and receptor 
downregulation after continued exposure to 
THC might mitigate the effects of this drug 
in certain brain regions.’ In patients with 
HD, loss of CB1 from the striatum could 
be an important pathogenic factor. Data 
from animal models suggest that abnor- 
mal huntingtin protein indirectly reduces 
CB1 expression in striatal neurons, leading 
to increased excitotoxicity and decreased 
levels of brain-derived neurotrophic factor. 
Early treatment with CB1 agonists might 
prevent these changes.’ 

Cannabidiol lacks psychoactivity and has 
shown promise as a neuroprotective agent 
in preclinical studies. An alternative strategy 
to prevent adverse psychological reactions 
that could also be of use in neuroprotec- 
tive studies is to increase endocannabinoid 
signalling or ‘tone’ using inhibitors of the 
constitutive hydrolytic enzymes FAAH, 
MAGL and DAGL. In fact, this could be 
one mechanism by which cannabidiol acts, 
because its general activity at cannabinoid 
receptors is low. 

Cannabis research has paralleled advances 
in opioid pharmacology, whereby a psycho- 
active plant extract has led to the discovery 
of endogenous signalling systems with thera- 
peutic relevance. We agree that cannabin- 
oids should be investigated and prescribed 
using the same evidence-based criteria as
for all other drugs. Systematic assessment of 
this evidence, such as the present review by 
Koppel and co-workers, can greatly aid clini- 
cal decisions. The current data, however, are 
limited, and we look forward to future ran- 
domized controlled trials investigating the 
symptom-alleviating and neuroprotective 
potential of cannabinoids. 
THE ENDOCANNABINOID SYSTEM: TIMELINE


Early phytocannabinoid chemistry to endocannabinoids and beyond


Raphael Mechoulam, Lumir O. Hanus, Roger Pertwee and Allyn C. Howlett


Abstract | Isolation and structure elucidation of most of the major cannabinoid
constituents, including Δ⁹-tetrahydrocannabinol (Δ⁹-THC), which is the principal
psychoactive molecule in Cannabis sativa, was achieved in the 1960s and 1970s.
It was followed by the identification of two cannabinoid receptors in the 1980s and 
the early 1990s and by the identification of the endocannabinoids shortly 
thereafter. There have since been considerable advances in our understanding of 
the endocannabinoid system and its function in the brain, which reveal potential 
therapeutic targets for a wide range of brain disorders. 


The plant Cannabis sativa and its many 
preparations (for example, marijuana, 
hashish, bhang and ganja) have been used 
for millennia for recreation (and at times for 
the achievement of religious ecstasy) as well 
as in medicine. In ancient China, cannabis 
was prescribed (together with other plants, 
as is customary in Chinese medicine) for 
numerous diseases, but it was noted that 
when taken in excess it could lead to ‘seeing
devils’. In Assyria (about 800 BC), it was
named both gan-zi-gun-nu (‘the drug that 
takes away the mind’) and azallu (when 
used as a therapeutic). In India, ancient 
Persia and medieval Arab societies, cannabis
use proceeded along these two divergent
routes¹. In many countries, hemp (a
strain of Cannabis sativa that does not cause
psychoactivity) was grown for its durable
fibres. Our present-day society follows a
long tradition of recreational, industrial and 
medical cannabis use. 


Cannabinoid discovery — early history 
The behavioural effects of cannabis, in sev- 
eral animal species as well as in humans, 
were observed in the mid-nineteenth 
century² (FIG. 1). These experimental observations
led to the first attempts to isolate
the active constituents of the plant, as had 
already been done with other plants that
had known neuropharmacological activ- 
ity — for example, the isolation of mor- 
phine. A prize was even awarded in 1855 
for the ‘successful’ accomplishment of this 
project. However, the first isolation of a 
plant cannabinoid — named cannabinol 
(CBN) — was not achieved until the end 
of the nineteenth century. Its structure was 
elucidated much later, in the 1930s, by the 
groups of Cahn and Todd in the United 
Kingdom and by Adams in the United 
States, when a further component, can- 
nabidiol (CBD), was isolated; however, its 
structure could not be elucidated at that 
time. Although considerable effort was 
invested in the isolation and the elucidation
of the structure of the main psychoactive
constituents of cannabis, this goal
was not reached at that time. A synthetic 
compound, Δ⁶ᵃ,¹⁰ᵃ-tetrahydrocannabinol
(Δ⁶ᵃ,¹⁰ᵃ-THC), showed pharmacological
activity that paralleled the activity of cannabis
extracts. Therefore, it was assumed that
Δ⁶ᵃ,¹⁰ᵃ-THC was chemically related to the
active compounds of the plant (FIG. 2). Much
of the early research in this area was done
using synthetic Δ⁶ᵃ,¹⁰ᵃ-THC, which is now
known to be considerably less potent than 
the actual natural product. The chemical 
and pharmacological work that was carried 
out until the mid 1940s has been reviewed
elsewhere. Some Δ⁶ᵃ,¹⁰ᵃ-THC analogues
were even tested in humans. In light of
recent media reports about the action
of cannabinoids in paediatric epilepsy, it
is of interest to note that a derivative of synthetic
Δ⁶ᵃ,¹⁰ᵃ-THC (at doses of 1.2–1.8 mg
daily) was administered to a small num- 
ber of children with epilepsy and showed 
positive results. Historical cannabis use in 
medicine over the ages and early chemical 
investigations are reviewed in REF. 1. 

The reasons for the lack of progress 
were mostly technical. We now know that 
cannabinoids are present in cannabis as a 
mixture of many closely related constitu- 
ents — over 100 — which were difficult to 
separate using the methods that were avail- 
able in the nineteenth and early twentieth 
centuries. As the active constituents of can- 
nabis were not available in pure form, there 
was very little biological or clinical work 
done in this area from the late 1940s until 
the mid 1960s. 

By the 1960s, chromatography meth- 
ods were well developed for the isolation 
of pure compounds from mixtures and 
the availability of novel spectrometric 
methods meant that the elucidation of the 
structure of these compounds was possible. 
Indeed, many cannabinoids were isolated, 
including Δ⁹-THC, which was reported
by Gaoni and Mechoulam in 1964 (REF. 6) 
(FIG. 2). Their structures were mainly elu- 
cidated using NMR, which was a modern 
method at the time. Several total syntheses 
of these compounds have been reported 
and most cannabinoids are now available 
as both natural and synthetic products. 
The chemical work until the mid 1970s is 
reviewed in REF. 7. 

The next step in cannabinoid research was 
the elucidation of the metabolism of Δ⁹-THC
and later of CBD. The major metabolic pathway
of Δ⁹-THC is hydroxylation, which leads
to the formation of an active metabolite, fol- 
lowed by its further oxidation to an inactive 
acid, which then binds to a sugar molecule. 
These acid-derived metabolites are stored in 
fatty tissues and are slowly released*. Indeed, 
the major final Δ⁹-THC metabolite (a carboxylic
acid that is present as a glucuronide)
can be detected in human urine for several 
weeks after cannabis use (FIG. 2). 
O'Shaughnessy investigates medicinal use of cannabis in India²
Isolation of first plant cannabinoid, cannabinol
Cannabinol structure elucidation
Cannabidiol isolation
Synthesis and evaluation of Δ⁶ᵃ,¹⁰ᵃ-THC³⁻⁵
Early pharmacological investigations
Cannabidiol structure elucidation
Δ⁹-THC isolation and structure elucidation⁶
Isolation and identification of additional cannabinoids
Ring immobility and tetrad assays
Research on cannabinoid pharmacology and metabolism
Discovery of CB1 (REF. 27)
Cloning of CB1 (REF. 29)
Isolation and structure elucidation of anandamide
Discovery of CB2 (REF. 31)
Discovery of SR-141716A, the first CB1 antagonist
Isolation and structure elucidation of 2-AG
Cloning of the first endocannabinoid-degrading enzyme, FAAH
Discovery of SR-144528, the first CB2 antagonist
Anandamide activates vanilloid receptors
Discovery of retrograde signalling by endocannabinoids
Cloning of the first endocannabinoid-biosynthesizing enzyme
Discovery and evaluation of endocannabinoid-like brain components; discovery and evaluation of functions of FAAH and MAGL inhibitors; cell biology and neuroscience studies carried out; clinical trials initiated

Figure 1 | Cannabinoid and endocannabinoid research: a timeline. Almost all early research
was devoted to clarification of cannabinoid chemistry, and pharmacology was mainly done
using synthetic compounds. Following the isolation and structure elucidation of the plant can-
nabinoids, particularly of cannabidiol and of Δ9-tetrahydrocannabinol (Δ9-THC), pharmaco-
logical and physiological work was initiated. The identification of cannabinoid receptors,
of endogenous cannabinoids and of receptor antagonists made possible extensive phar-
macological and neurobiological research leading to cloning of the anandamide-degrading
enzyme fatty acid amide hydrolase (FAAH), the discovery of retrograde signalling by
2-arachidonoyl glycerol (2-AG), the discovery of allosteric sites on cannabinoid receptor 1
(CB1), the discovery that endocannabinoids bind to receptors other than CB1 and CB2
(REFS 109-111), the discovery and evaluation of endocannabinoid-like molecules in the brain
and the discovery and function of inhibitors of the endocannabinoid-degrading enzymes.
Cell biology and neuroscience investigations were also carried out, and clinical trials were
initiated. Cloning of DAG lipase was also reported.


Early neuropharmacology 

The advances in the chemistry of plant and
synthetic cannabinoids led to renewed inter-
est in their neuropharmacology. Loewe had
found that cannabis extracts (presumably
containing high levels of what is now known
to be Δ9-THC and additional phytocan-
nabinoids) can induce catalepsy in mice and
that CBN can also produce this effect, albeit
much less potently than the impure THC
isolated from the resin. It was these findings
that prompted the development by Pertwee
in 1972 of a quantitative in vivo assay for
psychotropic cannabinoids, known as the
ring test, in which the proportion of time
that a mouse placed across an elevated hori-
zontal ring remains immobile or cataleptic is
measured over a 5 minute period. Martin
later used this assay, along with three other
bioassays, in what came to be known as the
‘mouse tetrad assay’. These other assays
provide measures of cannabinoid-induced
hypokinesia, hypothermia and antinociception
in mice (the last using a tail flick or hot plate test).
The mouse tetrad assay is a useful in vivo
screen for psychotropic cannabinoids, all
of which, in contrast to many other types
of drugs, generally show similar potency in
all four of these bioassays. It was also dis-
covered in the 1940s that cannabinoids can
elicit central excitant activity in rabbits and
mice and corneal areflexia in rabbits, and
that some phytocannabinoids, particularly
CBD, can prolong barbiturate-induced sleep
by a mechanism that was subsequently dis-
covered to involve the inhibition of certain
cytochrome P450 (CYP) enzymes.
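
As a rough illustration of how a ring-test score could be computed, the sketch below takes a list of immobile episodes (start and end times in seconds) recorded during a 5 minute observation and returns the proportion of time spent immobile. The function name, the episode format and the example timings are assumptions made for illustration; they are not taken from the published assay protocol.

```python
OBSERVATION_SECONDS = 5 * 60  # the 5 minute observation period described in the text


def immobility_fraction(episodes, observation=OBSERVATION_SECONDS):
    """Return the fraction of the observation period spent immobile.

    `episodes` is a list of (start, end) times in seconds. This input
    format is a hypothetical convention, not the published protocol.
    """
    total = 0.0
    for start, end in episodes:
        if not 0 <= start <= end <= observation:
            raise ValueError("episode outside the observation period")
        total += end - start
    return total / observation


# Hypothetical recording: three immobile episodes within 300 s
example = [(10, 70), (100, 190), (240, 300)]
print(f"immobility fraction: {immobility_fraction(example):.2f}")
```

A score near 1 would indicate a mouse that stayed immobile for almost the whole period, whereas a score near 0 would indicate little or no catalepsy.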

Following its identification as the main
psychoactive constituent of cannabis,
Δ9-THC attracted particular attention; for
example, results obtained from several inves-
tigations on humans indicated that when
Δ9-THC was taken orally or intravenously
or when it was inhaled in smoke, it showed
substantial potency at producing psycho-
logical changes similar to those reportedly
experienced in response to recreationally
consumed cannabis. A few other phyto-
cannabinoids, such as CBN, were found to
induce cannabis-like effects in humans with
low potency (an exception being Δ8-THC,
but there is usually very little Δ8-THC in
cannabis).

It is noteworthy that one synthetic
analogue of Δ9-THC, nabilone (Cesamet;
Valeant Pharmaceuticals North America),
was approved in 1981 as a medicine for the
suppression of the nausea and vomiting
that is produced by chemotherapy. Syn-
thetic Δ9-THC, dronabinol (Marinol; Solvay
Pharmaceuticals, Inc), subsequently entered

Figure 2 | A major metabolic pathway of Δ9-THC and the structures of some plant and syn-
thetic cannabinoids. a | The major psychoactive cannabis constituent, Δ9-tetrahydrocannabinol
(Δ9-THC), is first metabolized by enzymatic hydroxylation to produce psychoactive 11-hydroxy-
Δ9-THC (11-OH-Δ9-THC) and then by enzymatic oxidation to non-psychoactive Δ9-THC-11-oic
acid, which is stored in fatty tissues as a glucuronide and is slowly released. The glucuronide may
be detected in the urine for several weeks after a single cannabis use. b | The structures of some plant
and synthetic cannabinoids. Δ9-THC, the plant constituents cannabinol and Δ8-THC, and synthetic
Δ6a,10a-THC and CP-55940 cause cannabis-type psychoactivity, whereas cannabidiol does not.


the clinic as a licensed medicine, in 1985
as an antiemetic and in 1992 as an appetite
stimulant. Claims from patients that can-
nabis can ameliorate unwanted symptoms
of multiple sclerosis also encouraged the
development of the cannabis-based medi-
cine nabiximols (Sativex; GW Pharma),
which contains both Δ9-THC and the non-
psychoactive CBD; this was first licensed as
a medicine in 2005 in Canada for the relief
of pain experienced by adult patients suf-
fering from multiple sclerosis or advanced
cancer, and subsequently as a medicine to
ameliorate spasticity caused by multiple
sclerosis.


Discovery of the cannabinoid receptors 
Although a considerable amount of phar-
macological work was done on the activity
of Δ9-THC, its mechanism of action was
not elucidated for more than 20 years after
its identification. Indeed, it was originally
thought that the mode of action of Δ9-THC
was nonspecific in nature and that it might
involve interactions with lipid membranes.
However, although the stereospecificity of
the action of Δ9-THC and related synthetic
cannabinoids, as well as pharmacological
studies in humans and animals, had sug-
gested a putative cannabinoid receptor,
it was not until the 1980s that evidence for a
protein receptor was sought.

As the family of known G proteins
expanded in the late 1970s and early 1980s,
so did the list of receptors for hormones
and neurotransmitters to which they could
couple. Agonists of opioid, muscarinic cho-
linergic and α-adrenergic receptors resulted
in inhibition of Gs-stimulated adenylyl
cyclase, and functional homology with
these neuromodulators led to the discov-
ery that cannabinoids also inhibited this
enzyme by a pertussis toxin-sensitive
mechanism. This clearly indicated that the
cannabinoid receptor was a G protein-coupled
receptor (GPCR).

From the structure—activity relationship 
(SAR) established using cannabimimetic 
compounds from Pfizer Central Research, the 
Howlett laboratory identified CP55940 (FIG. 2) 
as a highly potent cannabinoid analogue 
and, in 1988, reported the determination and 
characterization of a cannabinoid receptor 
from the brain for which the criteria for a 
high-affinity, stereoselective receptor in brain
tissue had been fulfilled. Competitive dis-
placement of [3H]CP55940 from its target in
rat brain membranes by cannabinoid agonists
was enantioselective and followed the order
of potency for both Gi-mediated inhibition of
adenylyl cyclase and antinociception
in several rodent models. Later, signal
transduction assays were used to ultimately
deorphanize a 7-transmembrane receptor
now known to be the cannabinoid receptor 1
(CB1; also known as CNR1).
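
Displacement data of this kind are commonly summarized by converting the concentration of competitor that displaces half of the radioligand (IC50) into an inhibition constant (Ki) using the Cheng-Prusoff relation, Ki = IC50 / (1 + [L]/Kd). The sketch below applies that relation together with the single-site occupancy formula. It is not taken from the studies cited here, and the numerical values are hypothetical placeholders rather than measured data.

```python
def cheng_prusoff_ki(ic50, radioligand_conc, radioligand_kd):
    """Convert an IC50 into Ki for simple competitive binding.

    All three arguments must share the same concentration unit.
    """
    if ic50 <= 0 or radioligand_conc < 0 or radioligand_kd <= 0:
        raise ValueError("concentrations must be positive")
    return ic50 / (1.0 + radioligand_conc / radioligand_kd)


def fractional_occupancy(ligand_conc, kd):
    """Fraction of receptors occupied at equilibrium (single-site model)."""
    return ligand_conc / (ligand_conc + kd)


# Hypothetical values in nM, chosen only to illustrate the arithmetic
ki = cheng_prusoff_ki(ic50=20.0, radioligand_conc=1.0, radioligand_kd=1.0)
print(f"Ki = {ki:.1f} nM")
print(f"occupancy at [L] = Kd: {fractional_occupancy(1.0, 1.0):.2f}")
```

Ranking competitors by Ki rather than by raw IC50 removes the dependence on how much radioligand happened to be used in a given experiment, which is what makes an order of potency comparable across assays.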


Discovery of endocannabinoids and CB2 
Receptors are mostly activated by endog-
enous molecules, and therefore, there was
a strong reason to look for endogenous
cannabinoids. As Δ9-THC and its related
compounds that bind to CB1 are lipids,
it was reasonable to assume that any endog-
enous cannabinoids would also be lipids. In
order to isolate putative endogenous can-
nabinoid compounds, the ability of porcine
brain extracts to displace a novel, highly
potent radiolabelled cannabinoid probe,
[3H]HU-243, bound to CB1 was tested in
the Mechoulam laboratory. The fractions
that inhibited the binding of [3H]HU-243 to
the cannabinoid receptor were purified by a
series of chromatographies, which ultimately
led to the generation of a minute amount of
a single compound, an amide of arachidonic
acid (arachidonoyl ethanolamide), which
was named anandamide; this was the first
endocannabinoid to be identified. The struc-
ture of anandamide (FIG. 3) was established
by mass spectrometry, NMR spectroscopy
and by its synthesis. Anandamide was
found to have inhibitory activity that was
equivalent to that of Δ9-THC and was sub-
sequently shown to have cannabimimetic
activity as it inhibited the twitch response of
isolated mouse vasa deferentia.

In the meantime, a second receptor, CB2
(also known as CNR2), had been identified
by sequence homology and was presumed
to be mainly present in the periphery;
therefore, a search for a ‘peripheral’ endog-
enous agonist was initiated. Using the same
techniques that were used to isolate anan-
damide, it was possible to isolate an ester of
arachidonic acid, 2-arachidonoyl glycerol
(2-AG), from canine intestines (FIG. 1).
This compound was unexpectedly found
to bind CB1 and CB2 and to inhibit ade-
nylyl cyclase with a potency similar to that
of Δ9-THC. 2-AG also shared the ability of
Δ9-THC and anandamide to inhibit electri-
cally evoked contractions of isolated mouse
vasa deferentia; however, it was less potent
than Δ9-THC. Following administration

Figure 3 | Structures of the main endocannabi-
noids, anandamide and 2-AG, which bind to
CB1 and CB2 endocannabinoid receptors. 
Arachidonoyl ethanolamide (also known as anan- 
damide) and 2-arachidonoyl glycerol (2-AG) are 
hydrolysed to arachidonic acid by the enzymes 
fatty acid amide hydrolase (FAAH) and monoacyl- 
glycerol lipase (MAGL), respectively. Blocking 
these enzymes with various synthetic com- 
pounds leads to increased levels of these 
endocannabinoids. 


to mice, both anandamide and 2-AG caused
the typical tetrad of effects produced by
Δ9-THC: antinociception, immobility, reduc-
tion of spontaneous activity and lowering of
rectal temperature. Although a few additional
endocannabinoids have been reported, none
of them has been confirmed as a natural
endocannabinoid.

Anandamide is a partial agonist for CB1 
and CB2 and shows less relative intrinsic 
activity (also known as relative intrinsic 
efficacy) and affinity for CB2 than for CB1. 
2-AG shows greater potency and efficacy 
than anandamide as a CB1 agonist and 
greater potency than anandamide as a CB2 
agonist. In addition, it has been found
that both endocannabinoids interact with
certain non-CB1 and non-CB2 receptors
and ion channels. In the past few years,
lipoxin A4 and a new family of peptides
(known as pepcans) have been reported to
target CB1 as allosteric modulators, and the
peptide hemopressin, which is a putative
brain constituent, has been found to lower
pain via action on a cannabinoid receptor.

Synthesis of cannabinoid analogues that 
have high affinity and specificity for CB2 
was achieved in the mid to late 1990s and
led to the discovery of the role of CB2 in
immunosuppression, neuroprotection and
neuropathic and inflammatory pain. This
consequently led to considerable interest in
developing and investigating CB2-selective
agonists.

Both anandamide and 2-AG are syn-
thesized on demand, often in response to
increased concentration of intracellular
calcium, and it is now generally accepted
that one important role of these endocan-
nabinoids, although possibly only of 2-AG, is
to function as retrograde synaptic messengers
that can prevent the development of exces-
sive neuronal activity in the central nerv-
ous system and thereby contribute to the
maintenance of homeostasis in both health
and disease. Thus, there is good evidence
that neurotransmitters, such as glutamate,
produce postsynaptic increases in the concen-
tration of intracellular calcium in a manner
that can induce postsynaptic biosynthesis
and release of anandamide or 2-AG into the
synapse. In turn, this leads to
endocannabinoid-induced activation of
presynaptic CB1, which causes an inhibi-
tion of the neuronal release of glutamate,
γ-aminobutyric acid or other neurotransmit-
ters in brain areas that include the cerebral
cortex, hippocampus, ventral tegmental area,
substantia nigra, hypothalamus and cerebel-
lum. There is also evidence that, when
produced postsynaptically in response to
the activation of postsynaptic metabotropic
glutamate receptor 5 (mGluR5), ananda-
mide activates postsynaptic transient receptor
potential cation channel subfamily V member 1
(TRPV1) channels. It is also noteworthy
that results obtained from in vivo experi-
ments with rats suggest that retrograde 2-AG
signalling that is triggered by the activation of
mGluR5 can suppress pain sensitivity. The
endocannabinoid retrograde transport mech-
anism and modulation of synaptic transmis-
sion have not yet been fully elucidated.


Search for antagonist ligands 

The holy grail for cannabinoid synthetic
chemists was an antagonist that could block
the effects of Δ9-THC. It seems quite unu-
sual that no natural product or structurally
related analogue emerged to block the can-
nabinoid receptors. Before the advent of
gene knockout techniques, it was difficult to
establish whether a pharmacological effect
was mediated by a receptor if a selective
antagonist for that receptor had not been
developed. Thus, one can imagine the excite-
ment generated at an International Can-
nabinoid Research Society meeting in 1993
when a team of researchers from the French
pharmaceutical company Sanofi Recherche
announced their discovery of an antagonist
for CB1, SR141716A. This compound was
radiolabelled to investigate receptor pharma-
cology and was soon modified to develop
the first ligands for in vivo imaging. The
discovery of an antagonist (SR141716A),
which was in fact subsequently identified
as an inverse agonist, helped to characterize
additional cellular signalling pathways for
CB1 (REFS 50, 51, 53-55). More importantly,
an antagonist could finally be used to iden-
tify animal behaviours that were truly due to
CB1 activation. Indeed, the syndrome of
‘dependence’ on cannabinoid agonists was
first shown in an animal model after pre-
cipitated withdrawal using SR141716A.
Within a short period of time, industrial
laboratories and academic research groups
reported the synthesis of additional CB1
antagonists and inverse agonists.

The first CB2-selective antagonists,
AM630 (also known as iodopravado-
line) and SR144528, emerged in the mid
1990s and increased the ability to discern
novel actions that could be attributed to
CB2, including actions observed in liver
Kupffer cells, microglial cells and astro-
cytes and in the gastrointestinal system,
among others. Since that time, there has
been considerable progress towards the
development of highly selective and potent
CB2 antagonists.

SR141716A (also known as rimonabant)
was used therapeutically for the treatment of
obesity-related metabolic syndrome compo-
nents, including dyslipidaemia and diabe-
tes. SR141716A was marketed in Europe
but failed to gain approval from the US Food
and Drug Administration. As might be pre-
dicted, a drug that blocks CB1 neuromodula-
tion at synapses for the major stimulatory (in
the case of glutamate) and inhibitory (in the
case of GABA) transmitters throughout
the brain would be likely to produce multi-
ple ‘off-target’ effects. One such side effect,
which was reported in 2009, was an increase
in signs of depression in vulnerable
individuals treated with SR141716A. It
could be argued that the benefit to risk ratio
in a morbidly obese patient population might
mitigate the concerns about depression.
However, the drug was withdrawn from the
market and similar analogues from other
pharmaceutical companies were taken out
of the development pipeline. Nevertheless,
the development of SR141716A by Sanofi-
Aventis can be considered to be a major
contributor to our understanding that CB1
is present and functional in tissues such as
adipose, liver and pancreas under pathologi-
cal conditions of high-fat diet or obesity.
This new understanding of the role of CB1 in
metabolic regulation has inspired the search
for novel antagonists that fail to gain access to
the brain. An alternative clinical strategy
would be to screen for individuals who might
be most susceptible to the limbic effects of
CB1 antagonists before selecting a treatment
modality.


Endocannabinoid neuropharmacology
The discovery that anandamide and 2-AG
are endocannabinoids prompted research to
identify the biochemical processes that are
responsible for both their biosynthesis and
their metabolism. This research showed that
these two endocannabinoids are synthesized
‘on demand’ rather than stored, and it iden-
tified biosynthetic and metabolic pathways
for both of them. Thus, it has been dis-
covered that 2-AG is formed from diacyl-
glycerol (DAG) in a process that is catalysed
by sn1-specific DAG lipase-α and lipase-β,
and that the main biosynthetic pathway
for anandamide involves the formation of
N-arachidonoyl phosphatidylethanolamine
(NAPE) from phosphatidylethanolamine
and phosphatidylcholine, which is catalysed
by an as yet uncharacterized calcium-
dependent transacylase enzyme. This is
then followed by the conversion of NAPE to
anandamide in a single step that is catalysed
by NAPE-selective phospholipase D and/
or in two or three steps that are catalysed
by other enzymes. It has also been found
that, following their release, anandamide
and 2-AG are mainly metabolized to ara-
chidonic acid, the major metabolizing
enzymes being fatty acid amide hydrolase
(FAAH) for anandamide and mono-
acylglycerol lipase (MAGL) for 2-AG.
Other endocannabinoid-metabolizing
enzymes include FAAH-2 for ananda-
mide, α/β-hydrolase domain-containing 6
(ABHD6) and ABHD12 for 2-AG, and cyto-
chrome P450 enzymes, lipoxygenases and
cyclooxygenase 2 for both of these endocan-
nabinoids. The physiological relevance
of the lipoxygenase and cyclooxygenase
derivatives of anandamide and 2-AG is not
yet clear. It is also noteworthy that ananda-
mide and 2-AG can undergo cellular uptake
following their release, although whether
this process is mediated by a transporter is
currently unclear.

It is now recognized that, although engi- 
neering exogenous cannabinoids provided 
insights into receptor usage and linked func- 
tional events, the intracellular and extracel- 
lular actions and fate of endocannabinoids 


Glossary 


Affinity 

The potency with which a compound binds to a particular 
receptor; the higher the affinity of the compound, the 
lower the concentration at which it achieves a given level of 
receptor occupancy. 


Agonists 

Compounds that can activate pharmacological receptors; a 
full agonist is more potent than a partial agonist and so 
usually produces a greater maximum functional response. 


Allosteric modulators 

Drugs that can act on an allosteric site of a receptor to 
increase or to reduce the ability of an agonist or an inverse 
agonist to induce a functional response when it targets a 
different (orthosteric) site on the same receptor. 


Antagonist 

A compound that can bind to, but cannot activate, a 
receptor by targeting its orthosteric site and that can 
therefore prevent both drug-induced agonism and 
drug-induced inverse agonism at this receptor. 


Antinociception 
Another term for pain relief. 


Apoptosis 
A process of programmed cell death that usually has 
advantageous consequences. 


Catalepsy 
A condition that is characterized by immobility and 
muscular rigidity. 


Endocannabinoid 

An endogenous compound that can directly activate or 
block cannabinoid CB1 and/or CB2 or that can act as a 
positive or negative allosteric modulator to increase or to 
reduce responses of CB1 and/or CB2 to direct agonists or 
inverse agonists. 


G protein-coupled receptor 

(GPCR). A seven-transmembrane domain receptor that 
induces G-protein-mediated activation of intracellular 
signal transduction pathways when occupied by an agonist. 


Hashish 
A cannabis-derived preparation that consists mostly of 
dried cannabis resin. 


Hypokinesia 
A condition that is characterized by decreased bodily 
movement. 


Inverse agonist 

A compound that binds to a receptor in a manner that 
induces a pharmacological response opposite to the 
response that is induced by an agonist for the same 
receptor. 


Relative intrinsic activity 

The relative ability of drug—receptor complexes to produce 
maximum functional responses; a high-efficacy agonist 
needs to occupy fewer receptors to produce a maximal 
response than a low-efficacy agonist (also known as a 
partial agonist). 


Retrograde synaptic messengers 

Compounds that are released by a postsynaptic dendrite 
or cell body, but that act presynaptically — for example, to 
influence the release of a transmitter. 


Structure—activity relationship 
(SAR). The relationship between the pharmacological 
activity of compounds and their chemical structures. 


Transient receptor potential cation channel 
subfamily V member 1 

(TRPV1). A member of a superfamily of transmembrane 
cation channels; it was previously known as vanilloid 
receptor 1. 

versus those of exogenously introduced
cannabinoids may differ and have different
physiological consequences. It is also
recognized that many cannabinoid receptor
ligands also interact with a wide range of
non-cannabinoid receptor targets and that,
irrespective of whether they are endogenous,
synthetic or plant cannabinoids, the pharma-
cological profiles of these compounds often
vary considerably from each other.

The endocannabinoid receptors, the 
endocannabinoids and their biosynthetic and 
biodegrading enzymes constitute what has 
come to be known as the endocannabinoid 
system, the discovery of which prompted a 
search for its physiological and pathophysi- 
ological roles. This search revealed that there 
are several disorders in which endocan- 
nabinoids are released to their receptors in 
an ‘autoprotective’ manner that ameliorates 
unwanted effects of these disorders. It
also raised the possibility that increasing 
extracellular levels of a released endocan- 
nabinoid by inhibiting metabolizing enzymes 
such as FAAH or MAGL, or by inhibiting 
the cellular uptake of anandamide, might 
prove to be an effective therapeutic strategy 
to manage some of these disorders, which 
include multiple sclerosis, Parkinson's dis- 
ease, schizophrenia, hypertension, inflam- 
matory bowel diseases, pruritus, Alzheimer’s 
disease, depression, obsessive compulsive 
disorder and cancer. 

The discovery of the endocannabinoid
system also led to a reinvestigation of the
interactions of plant and synthetic cannabi-
noids with this system and other biochemical
entities. As a result, evidence has emerged
that Δ9-THC targets receptors other than CB1
(REFS 85-87). For example, at submicromolar
concentrations, Δ9-THC has also been found
to have several effects: first, it has been found
to activate CB2, albeit with less efficacy than
it activates CB1 (REF. 88); second, it has been
found to activate the deorphanized GPCRs
GPR18 (REF. 89) and GPR55 (REF. 33), the
cation channels TRPA1 and TRPV2 (REFS 90)
and the nuclear receptor peroxisome prolifer-
ator-activated receptor-γ (PPARγ); third, it
has been found to block the activation both of
5-hydroxytryptamine 3 (5-HT3) ligand-gated
ion channels and of TRPM8 cation chan-
nels; and, last, it has been found to enhance
the activation both of α1 subunits and α1β1
dimers of human glycine ligand-gated ion
channels and of native glycine receptors in
rat isolated ventral tegmental area neurons.
There have also been reports that submicro-
molar concentrations of Δ9-THC can inhibit
the enzyme lysophosphatidylcholine acyl
transferase, that it can increase the activity
of phospholipase C (which can catalyse the
production of DAG) and of phospholipase A2
(REF. 11), and that it can both inhibit the uptake
of adenosine by cultured microglia and mac-
rophages and affect the synaptosomal uptake
of 5-hydroxytryptamine (it inhibits this pro-
cess), of noradrenaline (it enhances this pro-
cess) and of dopamine (it both enhances and
inhibits this process). In addition, at higher
concentrations, Δ9-THC has been found to
affect several other such pharmacological
targets. For example, at concentrations
between 1 μM and 10 μM, it has been reported
to enhance the activation of β-adrenoceptors,
to function as a negative allosteric modulator
of μ- and δ-opioid receptors, to activate the
cation channels TRPV3 and TRPV4 and to
inhibit T-type calcium (CaV3) and potassium
(KV1.2) voltage-gated ion channels, as well as
conductance in Na+ voltage-gated ion chan-
nels. In this concentration range, Δ9-THC
has also been reported to inhibit the enzymes
lipoxygenase, Na+-K+-ATPase and monoam-
ine oxidase, as well as the cytochrome P450
enzymes CYP1A1, CYP1A2, CYP2B6 and
CYP2C9, to inhibit noradrenaline-induced
melatonin biosynthesis, and to activate or to
inhibit Mg2+-ATPase.


Perspectives 

There has been much progress in our under- 
standing of the plant cannabinoids and of 
CB1 and CB2. We have identified endogenous 
lipid mediators that act on these receptors to 
regulate multiple pathways of cellular signal- 
ling. We have discovered synthetic agonists 
and antagonists for these receptors as well as 
allosteric modulators of CB1. However, there 
is still much more knowledge to be gained 
and challenges to be met in the fields of can- 
nabinoid receptor neuroscience, pharmacol- 
ogy, molecular biology and cannabinoid 
medicine. 

We now need to understand how the 
endocannabinoid receptors interact with 
other proteins in complexes that regulate dif- 
ferentiated functions both at the cell surface 
and in intracellular organelles, particularly in 
the brain.

Dozens of endogenous molecules, with
structures that resemble those of the endo-
cannabinoids, have been discovered in the
brain. The activity of most of these mol-
ecules is not known. Some of those that have
been investigated show activities that have
therapeutic potential; for example, arachi-
donoyl serine is a vasodilator and is neu-
roprotective after brain injury as it reduces
apoptosis. It leads to proliferation of neural
progenitor cells in vitro and maintains these
cells in an undifferentiated state in vitro and
in vivo. Although it does not bind to CB1 and
CB2, its activity is blocked by CB2 antago-
nists. This raises questions, such as what is
the relationship of such endocannabinoid-like
compounds to the endocannabinoid system
and what are the physiological roles of these
molecules in the brain?

Pucci et al. have investigated the possible
epigenetic regulation of skin differentia-
tion genes by phytocannabinoids. CBD
was found to increase DNA methylation of 
the keratin 10 gene. Remarkably, CBD also 
reduced keratin 10 mRNA levels by a CB1- 
dependent mechanism. Thus, in this system, 
CBD is apparently a transcriptional repressor 
that can control cell proliferation and dif- 
ferentiation. As anandamide has also been 
found to have epigenetic properties, it is
of interest to determine the extent, if any, of 
transcriptional control by endocannabinoids 
by epigenetic mechanisms. 

Although various methods have been
used to enhance endocannabinoid levels
in vivo (even in patients), neither anan-
damide nor 2-AG has been administered to
humans. In addition, only a small number of
clinical studies have been carried out using
plant cannabinoids. A notable exception is
the recent successful clinical trial using CBD
in schizophrenic patients. Although it is
widely mentioned in the general media that
cannabis with a high concentration of CBD
is therapeutic in paediatric epilepsy and that
‘medical marijuana’ is indeed of value in
such cases, there have not been any recent
clinical trials reported, although several such
trials are ongoing (an anti-epileptic trial of
CBD in adults was reported 34 years ago).

In a recent review, Pacher and Kunos
suggested that “modulating endocannabi- 
noid system activity may have therapeutic 
potential in almost all diseases affecting 
humans”. They supported this strong state- 
ment with a long list of examples, although 
these examples were mostly obtained 
in vitro or from in vivo experiments in ani-
mals. If this summary of effects is shown
to reflect actions in human patients, is the 
endocannabinoid system going to bring a 
revolution in therapy? This might be the 
case as investigators are now able to target 
multiple cell-specific synthetic and biotrans- 
formation enzyme pathways that can adjust 
the levels of endocannabinoid ligands with 
some degree of tissue selectivity. In addi- 
tion, aside from the agonist and antagonist 
ligands for cannabinoid receptors, research- 
ers can now target cell type-specific allos- 
teric modulators and receptor-associated 
proteins. Thus, there is great promise for the 
future of cannabinoid research. 

Genetic predisposition to schizophrenia associated with increased use of cannabis


RA Power, KJH Verweij, M Zuhair, GW Montgomery, AK Henders, AC Heath, PAF Madden, SE Medland, NR Wray and NG Martin


Cannabis is the most commonly used illicit drug worldwide. With debate surrounding the legalization and control of use, 
investigating its health risks has become a pressing area of research. One established association is that between cannabis use and 
schizophrenia, a debilitating psychiatric disorder affecting ~ 1% of the population over their lifetime. Although considerable 
evidence implicates cannabis use as a component cause of schizophrenia, it remains unclear whether this is entirely due to 
cannabis directly raising risk of psychosis, or whether the same genes that increase psychosis risk may also increase risk of
cannabis use. In a sample of 2082 healthy individuals, we show an association between an individual’s burden of schizophrenia risk 
alleles and use of cannabis. This was significant both for comparing those who have ever versus never used cannabis 

(P=2.6x 10“), and for quantity of use within users (P=3.0 x 1073). Although directly predicting only a small amount of the 
variance in cannabis use, these findings suggest that part of the association between schizophrenia and cannabis is due to a shared 
genetic aetiology. This form of gene-environment correlation is an important consideration when calculating the impact of 


environmental risk factors, including cannabis use. 
INTRODUCTION


Over the last quarter of the 20th century, cannabis became the most
widely used illicit drug in the world.1 It is well established that
cannabis use is much higher among schizophrenic patients than in
the general population.2
Cannabis intoxication can lead to an acute transient psychotic
episode and produce short-term exacerbations of pre-existing
psychotic symptoms,3-5 an association that has been confirmed
through the experimental administration of tetrahydrocannabinol.6,7
Meta-analyses of prospective studies have found that
cannabis use increases the likelihood of developing a psychotic
illness by a factor of roughly two.8-11 A dose response effect has
been demonstrated,12-14 and use in adolescence has been associated
with the greatest risk.15 Given the large health burden from
schizophrenia and other psychotic disorders,16 the view that
cannabis use is a component cause of schizophrenia has heavily
influenced discussion over the legislation surrounding cannabis use.

However, the relationship between schizophrenia and cannabis
use may be more complicated than it initially seems. Despite a
clear association between the two, the possibility of reverse
causation has not been entirely excluded. Some small studies have
suggested that it is in fact psychosis that is a risk factor for
cannabis use, as those on a psychotic spectrum are more likely to
experiment with drugs.17,18 The strongest evidence comes from
Ferdinand et al.,19 who found that the association was bidirectional,
as cannabis-naive children with prodromal psychotic
episodes had greater incidence of later cannabis use. However,
a similarly sized study failed to replicate this finding.20 There is
also the possibility of attempts by patients at self-medication, as it
has been suggested that cannabis use can reduce negative and
affective symptoms in patients with an established psychotic
disorder.21-23

The issue is further complicated by tentative evidence for
interactions between cannabis use and genetic risk variants for
schizophrenia.24 Schizophrenia is known to be highly heritable
with up to 80% of the variance explained by additive genetic
effects,25 and as sample sizes have increased a growing number of
genetic risk variants have been identified.26,27 Interactions
between risk variants and cannabis use might explain why some
individuals experience psychosis while others do not. However,
cannabis use itself has been reported to be heritable,28-30
although no genetic risk variants have been identified.31 It is
unclear to what extent the heritability of cannabis use results from
shared heritability with other behavioural phenotypes such as
schizophrenia predicting its use.

Here we test for such genetic overlap directly, and aim to 
discern the direction of causation between cannabis use and 
schizophrenia. Within a sample of 2082 healthy individuals, we 
tested to see whether polygenic risk scores for schizophrenia 
predict cannabis use. Polygenic risk scores reflect the cumulative 
burden of risk alleles carried by an individual as identified in a 
previous genome-wide association study (GWAS),32 here of 13 833
schizophrenia cases and 18 310 controls.27 Such an association
with cannabis use would suggest that those genetically predis- 
posed to schizophrenia use cannabis more frequently. This would 
mean that the association between schizophrenia and cannabis 
use is not simply one of an environmental risk factor, but 
rather involves gene-environment correlation, as individuals 
choose and shape their own environment based on their own
innate preferences. 


MATERIALS AND METHODS 


The data used in this study come from the Australian Twin Registry. Data 
were obtained from two studies in which twins and their families 
participated in semi-structured diagnostic telephone interviews aimed 
primarily at assessing psychiatric health. Informed consent was obtained 
from all participants. 

Sample 1 consisted of 6265 individuals aged between 23 and 39 years 
(mean = 29.9 + 2.5) interviewed between 1996 and 2000. Participants were 
members of the young adult cohort, a volunteer panel of twins born 
between 1964 and 1971. The interview was based on a modified version of 
the SSAGA (Semi-Structured Assessment of the Genetics of Alcoholism). 
Detailed information about the sample recruitment, the study procedure 
and the measures can be found elsewhere.** Sample 2 comprised 9688 
individuals aged between 18 and 91 years (mean = 46.3 + 11.3) interviewed 
between 2001 and 2005. Participants were members of the older and 
younger adult cohort of Australian twin pairs (born between 1895 and 
1964, and between 1964 and 1971, respectively). A subset of this sample 
was ascertained based on large sibship size, or having a relative with 
nicotine or alcohol dependence. The interview used for this sample was 
also based on a modified version of the SSAGA. Further details about the 
sample and assessment can be found in Heath et al.>° 

A subset of the participants (N = 1866; 11.7%) participated in both
studies, in which case we used data from the last assessment. The 
combined phenotypic sample consisted of 14087 individuals, of whom 
7172 were genotyped. In both studies, twins were asked the same items 
about cannabis use: (1) did you ever use marijuana?, (2) how old were you 
the very first time you tried marijuana (not counting the times you took it 
as prescribed)? and (3) how many times in your life have you used 
marijuana (do not count times when you used a drug prescribed for you 
and took the prescribed dose). Ever use was measured on a dichotomous 
scale (ever versus never), whereas age at initiation and quantity of use 
were open questions. Table 1 shows the prevalence of cannabis use for 
individuals included in the present study. 

Genotype data were obtained using three different Illumina single nucleotide
polymorphism (SNP) genotyping platforms (317K, HumanCNV370-Quadv3,
Human CNV370v1 and Human610-Quad). Standard quality control
procedures were applied as outlined previously,° including checks for 
ancestry outliers, Hardy-Weinberg equilibrium (P < 10⁻⁶), Mendelian errors,
call rate, genotypic missingness (>5%), individual missingness (>5%) and 
minor allele frequency (< 0.01). Individuals were pruned on relatedness, 
removing one individual from each pair with relatedness >0.05, as 
determined from genetic data. The final sample therefore comprised 
2082 ‘unrelated’ individuals (see Table 1 for sample details). 
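
The relatedness pruning described above can be illustrated with a minimal sketch. It assumes a greedy rule (drop the second member of any pair still above the cutoff), which the text does not specify; the function name, identifiers and toy relatedness values are hypothetical.

```python
def prune_related(individuals, relatedness, cutoff=0.05):
    """Greedily drop one member of every pair whose relatedness exceeds cutoff.

    individuals: iterable of sample identifiers
    relatedness: dict mapping (id_a, id_b) -> estimated genetic relatedness
    Returns the set of retained, mutually 'unrelated' identifiers.
    """
    kept = set(individuals)
    # Sort the pairs so that the result does not depend on dict ordering.
    for (a, b), r in sorted(relatedness.items()):
        if r > cutoff and a in kept and b in kept:
            kept.discard(b)  # assumption: always drop the second member
    return kept


# Hypothetical toy family: two twins (t1, t2), a sibling (s1) and a stranger (u1).
toy_relatedness = {
    ("t1", "t2"): 1.0,
    ("t1", "s1"): 0.5,
    ("t2", "s1"): 0.5,
    ("t1", "u1"): 0.01,
}
retained = prune_related(["t1", "t2", "s1", "u1"], toy_relatedness)
```

In this toy family only one relative and the unrelated individual are retained, which mirrors how a family-based registry of 7172 genotyped participants can shrink to a much smaller unrelated subset.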

Polygenic risk scores were constructed using the P-values and log10
odds ratios from the most recent large GWAS of schizophrenia, a meta- 
analysis of the Psychiatric Genomics Consortium’s studies with additional 
Swedish samples totalling 13 833 cases and 18 310 controls.27 SNPs were
pruned for linkage disequilibrium using P-value informed clumping in
PLINK,37 using a cutoff of R² = 0.25 within a 200 kb window. The major
histocompatibility complex region of the genome was excluded, due to its 
complex linkage disequilibrium structure. After linkage disequilibrium 
pruning, 147 830 SNPs remained. Multiple scores were generated for each 
individual using the PLINK score option and based on top SNPs from the 
schizophrenia GWAS using varying significance thresholds (P=0.0001, 
0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5 and 1.0). Polygenic risk scores were
tested for association with a binary ever versus never used cannabis and 
two quantitative traits for quantity of use and age at first use, in logistic 
and linear regressions, respectively. These analyses were corrected for the
first 10 ancestry-informative principal components, genotyping platform,
sex, age, age squared and sex by age. Analysis was performed in STATA.38


Table 1. Summary statistics of sample for cannabis use traits

                                          Users         Non-users
N                                         1011          1071
Mean age (s.e.)                           41.3 (0.23)   53.0 (0.37)
Percentage female (%)                     46.5          56.0
Mean age at initiation (s.e.)             19.6 (0.06)   -
Mean number of uses over lifetime (s.e.)  62.7 (4.56)   -
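
The scoring step described in the Methods can be sketched as follows. This is an illustrative simplification, not the PLINK implementation used in the study: it assumes the score is a plain sum of risk-allele counts weighted by log10 odds ratios, and the SNP identifiers, P-values and odds ratios are hypothetical toy values.

```python
import math


def polygenic_score(genotypes, gwas, p_threshold):
    """Weighted risk-allele burden for one individual.

    genotypes: dict SNP id -> risk-allele count (0, 1 or 2)
    gwas: dict SNP id -> (p_value, odds_ratio) from the discovery GWAS
    Only SNPs with a discovery P-value at or below p_threshold contribute.
    """
    score = 0.0
    for snp, (p_value, odds_ratio) in gwas.items():
        if p_value <= p_threshold and snp in genotypes:
            score += genotypes[snp] * math.log10(odds_ratio)
    return score


# Hypothetical discovery results and one hypothetical genotyped individual.
toy_gwas = {
    "snp_a": (0.00005, 1.10),
    "snp_b": (0.008, 1.05),
    "snp_c": (0.3, 0.98),
}
toy_person = {"snp_a": 2, "snp_b": 1, "snp_c": 1}

# The ten significance thresholds listed in the Methods.
thresholds = [0.0001, 0.001, 0.01, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 1.0]
scores = {t: polygenic_score(toy_person, toy_gwas, t) for t in thresholds}
```

Each individual therefore receives one score per threshold, and each of the ten scores is tested separately against the cannabis use traits.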


RESULTS 

After pruning, 2082 unrelated individuals remained in our sample 
with both genotype and phenotype measures. Within the sample, 
1011 individuals (48.6%) had ever used cannabis, of whom 997 
had data on quantity of use. Mean number of usages of cannabis 
over lifetime was 62.7 (95% CI 53.8-71.6), and the mean age of
initiation of use was 20.1 (95% CI 19.7-20.5). Males showed higher
rates of use than females, 53.5% compared with 43.9% (P < 0.001),
although there was no significant difference in age at initiation. Table 1
shows the summary statistics for the sample. 
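
As a worked check, the sex-specific rates of use and the confidence interval for lifetime use quoted above can be recovered from Table 1. The sketch below assumes a two-proportion z-test and a normal-approximation interval; the text does not state which test produced the reported P-value, so these are illustrative choices.

```python
import math

# Counts and percentages as reported in Table 1.
users, non_users = 1011, 1071
pct_female_users, pct_female_non_users = 46.5, 56.0

# Reconstruct approximate cell counts by sex.
female_users = round(users * pct_female_users / 100)
female_non_users = round(non_users * pct_female_non_users / 100)
male_users = users - female_users
male_non_users = non_users - female_non_users

n_female = female_users + female_non_users
n_male = male_users + male_non_users
female_rate = female_users / n_female
male_rate = male_users / n_male

# Two-proportion z-test (assumed test) for the male versus female difference.
pooled = users / (users + non_users)
std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_male + 1 / n_female))
z = (male_rate - female_rate) / std_err
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

# Normal-approximation 95% CI for mean lifetime uses, from the mean and s.e.
mean_uses, se_uses = 62.7, 4.56
ci_uses = (mean_uses - 1.96 * se_uses, mean_uses + 1.96 * se_uses)
```

Under these assumptions the reconstructed rates round to the 53.5% and 43.9% given in the text, and the interval for lifetime use rounds to 53.8-71.6.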

Polygenic risk scores for schizophrenia showed positive 
associations for ever versus never use of cannabis across all 
P-value thresholds, with the strongest association for those SNPs 
with P-values of 0.01 or below in the original schizophrenia GWAS 
(see Figure 1, R² = 0.47%, P = 2.6 × 10⁻⁴). Significant associations
were also seen in the analysis of quantity of cannabis use for 9 of
the 10 SNP cutoffs, with the top association seen for those SNPs
with P < 0.05 for schizophrenia (R² = 0.85%, P = 0.003). No association
was seen with age at initiation of use, although the
association with quantity of use remained significant when 
number of years of usage was accounted for (results not shown). 

As a secondary analysis, polygenic risk score for schizophrenia 
risk alleles with P<0.01 (the threshold with the greatest 
association in the primary analysis) was examined within 990 
twin pairs (608 dizygotic and 382 monozygotic) where data on 
cannabis use of both twins was available. Taking the mean 
polygenic risk score within each twin pair, an ordinal regression 
was performed to predict whether neither (n = 272), one (n= 273) 
or both twins (n=445) were cannabis users. After correcting for 
age, sex and zygosity, a significant association was observed 
(P=0.001). Those twin pairs where both reported using cannabis 
had the greatest burden of schizophrenia risk alleles, pairs with 
only one user were found to have an intermediate level and the
lowest burden was found in pairs where neither twin reported use
(see Figure 2).


Schizophrenia polygene scores and cannabis use

Figure 1. Results of polygenic risk scores for schizophrenia predicting
variance explained (R²) in cannabis use as both a binary trait of
ever versus never, and as a quantitative trait of lifetime use within
only users. Polygenic scores were created using different cutoffs for
the inclusion of risk variants for schizophrenia, ranging from
P=0.0001 to 1.0.

Figure 2. Mean standardized schizophrenia polygenic risk scores for
pairs of twins when neither (n= 272), one (n=273) or both twins
(n=445) had reported use of cannabis. An ordinal regression
reported a significant association (P=0.001).
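
The data preparation behind the twin-pair analysis can be sketched as below. The sketch only builds the pair-level predictor (mean score) and the ordinal outcome (0, 1 or 2 users) and summarises the group means; it does not fit the ordinal regression or adjust for age, sex and zygosity. The function name and toy scores are hypothetical.

```python
from statistics import mean


def pair_summary(pairs):
    """Mean pair-level polygenic score by number of cannabis users in the pair.

    pairs: list of (score_twin1, score_twin2, used_twin1, used_twin2)
    Returns a dict mapping 0, 1 or 2 users -> mean of the pair-mean scores.
    """
    groups = {0: [], 1: [], 2: []}
    for score_1, score_2, used_1, used_2 in pairs:
        n_users = int(used_1) + int(used_2)             # ordinal outcome
        groups[n_users].append((score_1 + score_2) / 2)  # pair-level predictor
    return {k: mean(v) for k, v in groups.items() if v}


# Hypothetical standardized scores for four twin pairs.
toy_pairs = [
    (-0.4, -0.2, False, False),
    (-0.1, -0.3, False, False),
    (0.0, 0.2, True, False),
    (0.3, 0.5, True, True),
]
summary = pair_summary(toy_pairs)
```

A monotonic rise in the group means from neither to both twins using cannabis is the pattern that Figure 2 reports for the real data.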


DISCUSSION 


Our results show that to some extent the association between 
cannabis and schizophrenia is due to a shared genetic aetiology 
across common variants. They suggest that individuals with an 
increased genetic predisposition to schizophrenia are both more 
likely to use cannabis and to use it in greater quantities. This is not 
to say that there is no causal relationship between use of cannabis 
and risk of schizophrenia, but it does establish that at least part of 
the association may be due to a causal relationship in the opposite
direction. Although the variance in cannabis use explained by 
schizophrenia polygenic risk scores is small, it is in line with other 
cross-phenotype analyses, largely due to the polygenic risk scores 
for schizophrenia predicting only ~7% of the variation for 
schizophrenia itself. Previous associations between polygenic risk 
scores for schizophrenia and other psychiatric illnesses, such as 
bipolar disorder, major depression and autism,39 have shown
effects of similar sizes. Further research will be needed to see 
whether the genetic overlap observed here is specific to cannabis 
use or is present across illicit drug use and addiction phenotypes, 
data for which were not widely available in this sample. For now,
these findings have important implications for the current 
perception of cannabis use as a risk factor for schizophrenia, 
and other psychotic disorders. 

However, it is worth noting that this association, if true, does 
not rule out the possibility of cannabis independently being a risk 
factor for schizophrenia. A bidirectional association between 
cannabis use and psychosis has previously been suggested.*° 
Further, one caveat to interpreting the direction of causation 
concerns the discovery sample used to identify schizophrenia risk 
alleles. The schizophrenia GWAS sample will likely include many 
more cannabis users among cases than controls. This may lead to 
an excess of causal SNPs associated with cannabis use, as opposed 
to schizophrenia itself, identified as schizophrenia risk alleles. Only 
if the discovery schizophrenia sample was comprised entirely of 
non-cannabis users could causation be inferred without any risk of 
confounding. This is an important consideration as to whether 
polygenic risk scores overestimate individuals’ un-modifiable 
genetic risk by including their genetic predisposition to modifiable 
environmental risk factors. 

These results highlight the blurring between behavioural 
phenotypes and environment, and have wider implications for 
how we perceive supposedly environmental risks for disease. 
Individuals select their own environments based on their innate
and learned preferences, and have their environments react to 
their own behaviour. Further, parents pass down both genes and 
environment to their children. All of these can contribute to 
gene-environment correlation, particularly with respect to beha- 
vioural traits. Several studies have shown that supposedly environ- 
mental risk factors such as urbanicity, religiosity and stressful life 
events have heritable components to them.*' *? The existence of 
heritability for supposedly environmental risk factors does not 
mean they are inevitable, only that causality is more complicated 
to discern. Future studies will need to explore the matching of 
cases and controls on environmental risk variants to fully 
disentangle causation. This can be supplemented by exploring the
generation of polygenic risk scores for environmental risk factors, 
and their role in predicting disease status. The wider availability of 
genetic data in richly phenotyped samples should allow for the 
integration of genetics into an epidemiological framework, and so 
the discovery of gene-environment correlations where they exist. 

With ongoing debate over the legalization of cannabis and the 
potential health risks it poses, understanding the association 
between its use and schizophrenia is a priority. It has previously 
been suggested that, even assuming an entirely causal relation- 
ship, the required reduction in the number of cannabis users to 
prevent one case of schizophrenia is in the thousands.** 
Our findings here highlight the possibility that this association 
might be bidirectional in causation, and that the risks of cannabis 
use could be overestimated. This is an important subtlety to 
consider when calculating the economic and health impact of 
cannabis use. 
The use of recreational and medical marijuana is increasingly accepted by the general public in the United States.
Along with growing interest in marijuana use has come an understanding of marijuana’s effects on normal physiology 
and disease, primarily through elucidation of the human endocannabinoid system. Scientific inquiry into this system 
has indicated potential roles for marijuana in the modulation of gastrointestinal symptoms and disease. Some patients 
with gastrointestinal disorders already turn to marijuana for symptomatic relief, often without a clear understanding of 
the risks and benefits of marijuana for their condition. Unfortunately, that lack of understanding is shared by health- 
care providers. Marijuana’s federal legal status as a Schedule I controlled substance has limited clinical investigation
of its effects. There are also potential legal ramifications for physicians who provide recommendations for marijuana 
for their patients. Despite these constraints, as an increasing number of patients consider marijuana as a potential 
therapy for their digestive disorders, health-care providers will be asked to discuss the issues surrounding medical
marijuana with their patients.
Archaeological records indicate that marijuana, also known as
cannabis, has been cultivated and used for its psychoactive and 
medicinal properties for over 2700 years (1). With the Controlled 
Substances Act of 1971, however, marijuana was classified in the 
United States as a Schedule I substance with no accepted medical use 
and high potential for abuse, similar to heroin (2). Despite federal 
prohibition, the use of marijuana for recreational and medicinal pur- 
poses continues to increase (3,4). Following the 2013 election cycle, 
medical marijuana programs exist in 21 states as well as Washington 
DC. Furthermore, the recreational use of marijuana was legalized in 
both Colorado and the state of Washington in 2012 and it has been 
decriminalized in 15 additional states (Figure 1). With these devel- 
opments, medical professionals who care for patients with digestive 
disorders are increasingly faced with questions about the therapeutic 
role of marijuana and, in some cases, are asked to provide documen- 
tation to support a request for medical marijuana. This brief review 
is intended to help inform those discussions. 


Phytocannabinoids and the endocannabinoid system 
Marijuana is the common name for the Cannabis plant, from 
which nearly 500 different chemical compounds have been
isolated (5). Among these, the most clinically relevant are the phytocannabinoids
that are concentrated in the plant's flowering buds
that are harvested for consumption. The vast majority of interest
has focused on Δ9-tetrahydrocannabinol (THC), which is primarily
responsible for the psychoactive effects of marijuana (6).
Marijuana also contains ~70 other phytocannabinoids, such as 
cannabidiol (CBD), that are present in varying ratios when com- 
pared with THC content and seem to have minimal psychotropic 
effects (5,7). There are two main subspecies of the Cannabis plant: 
Cannabis sativa and Cannabis indica. Sativa-dominant strains 
have higher THC content than indica strains, in which the CBD 
content is higher (8). 

Figure 1. State-level marijuana regulation status.

Scientific interest in the medical application of marijuana-based
compounds heightened in the early 1990s with the discovery
of an endogenous cannabinoid signaling system, termed
the endocannabinoid system, through which phytocannabinoids
appear to signal. The endocannabinoid system has since been
implicated in diverse physiologic processes (9). It includes two G
protein-coupled cannabinoid receptors, CB1 and CB2, as well as
two endogenous ligands or endocannabinoids: anandamide and
2-arachidonylglycerol (10). Generally, CB1 is found in greatest
abundance in central and peripheral neurons, whereas CB2
is expressed predominantly in immune cells (11,12). In the gastrointestinal
tract, CB1 receptors are expressed principally in
the enteric nervous system with high concentration within cho- 
linergic neurons of the myenteric and submucosal plexus where 
receptors are thought to promote an inhibitory effect on motil- 
ity and secretory function via reduced acetylcholine release (13). 
CB1 receptors have also been identified in normal human colonic 
epithelium and smooth muscle, and CB1 receptor activation has 
been shown to enhance epithelial wound healing. CB2 receptor 
expression appears to be more pronounced in inflamed colonic 
epithelium and lamina propria immune cells, and there is in vitro 
evidence to suggest that activation of epithelial CB2 receptors by 
cannabinoids inhibits tumor necrosis factor-α-induced interleukin-8
release (14,15). Expression of cannabinoid receptors is very
limited in the normal liver but is increased in experimental liver 
injury and cirrhosis. CB1 and CB2 receptor activation have been 
shown to induce pro- and anti-fibrogenic effects, respectively. In 
the normal pancreas, there is weak expression of CB1 and CB2 
that increases in the setting of inflammation; however, there have 
been conflicting studies on the impact of cannabinoid receptor 
activation on experimental pancreatitis. There is also evidence 
that cannabinoid agonists have apoptotic, antiproliferative, and 
antimetastatic effects in several gastrointestinal cancer cell lines 
and animal models (16). 

THC is a partial agonist of both CB1 and CB2 but has higher 
affinity for CB1, which appears to mediate its psychoactive proper- 
ties. CBD has much weaker affinity for cannabinoid receptors but 
has demonstrated anti-inflammatory properties that may occur 
through CB2 inverse agonism or independently of cannabinoid 
receptors altogether (17). Several synthetic cannabinoid com- 
pounds have been developed to try to modulate the endocannabi- 
noid system for therapeutic purposes. Two have been approved by 
the Food and Drug Administration (FDA). Dronabinol (Marinol) 
is a synthetic THC that is indicated for chemotherapy-induced 
nausea and vomiting as well as AIDS anorexia. Another THC 
analog, known as nabilone (Cesamet), was approved for nausea
and vomiting after cancer chemotherapy unresponsive to typical 
antiemetics. A third commercial medication, nabiximols (Sativex), 
is an oromucosal spray with relatively equal amounts of THC and 
CBD. It is approved in 13 countries, including Canada and the 
United Kingdom, for use in patients with cancer pain, neuropathic 
pain, or spasticity due to multiple sclerosis. Rimonabant (Acom- 
plia) is a selective CB1 antagonist that was marketed for weight loss 
and used for smoking cessation but has been withdrawn because 
of adverse psychiatric effects—primarily depression. Numerous 
other CB1 and CB2 agonists and antagonists are currently under- 
going investigation, including a phase II study in Europe of an oral 
therapy for ulcerative colitis that contains CBD and THC in a ratio 
of 20:1 (18,19). 


Practical aspects and consequences of medical marijuana use 
The psychotropic and physiologic effects of marijuana can 
vary greatly for different individuals depending on the route of 
administration, the relative dosage of THC and other phytocan- 
nabinoids, and the chronicity of use (20). An array of medical 
marijuana products is now available, including many different 
edible forms—from brownies and honey to barbeque sauce and 
soda—as well as very potent concentrates. Still, the most common 
method of consumption remains smoking that is done through 
a variety of delivery devices. These include marijuana cigarettes 
(“joints”), pipes (“bowls”), or water pipes with a chamber for 
water filtration (“bongs”). Vaporizers are often used to heat mari- 
juana to a temperature sufficient to evaporate cannabinoids but 
not burn plant material, resulting in limited inhalation of tar and 
irritants (21). Other new inhalation methods, such as water pipes 
with carbon filtration systems, appear to reduce exposure to pes- 
ticides present in cultivated marijuana (22). 

Product labeling and testing requirements are not standard- 
ized across states but they commonly include information about 
pesticides and contaminants as well as THC potency. State- 
licensed testing facilities often also provide data on the percentage 
of other non-THC phytocannabinoids such as CBD, cannabinol, 
cannabigerol, cannabichromene, and tetrahydrocannabivarin.
Despite a lack of clinical data, statements regarding the efficacy
of these compounds are often communicated to would-be
buyers (23,24). The relative cannabinoid composition and 
THC content of marijuana products varies greatly, with THC 
content typically in the range of 0.5 to 5%; however, it has been 
increasing over time and concentrates can contain over 50% 
THC (25,26). 

Following inhalation, the psychotropic effects of marijuana gen- 
erally start within a minute, peak within a half hour, and begin 
tapering within 2-3h. Following oral ingestion, physiological 
effects typically begin after 30-90 min, peak after 2-3h, and can 
last 4-12h, depending on dose and specific effect (27). Notably, 
this delayed onset can lead to difficulty in regulating dosage to 
achieve a desired effect in real time. 

Figure 2. Drug use, dependency, and mortality statistics for the United States. (a) The proportion of US individuals aged 15 to 54 years who ever used
common drugs (31). (b) The proportion of drug users who ever became dependent (31). (c) Annual deaths in the United States attributable to tobacco
(32), alcohol (33), cocaine (34), heroin (34), anxiolytics (34), and marijuana (35).

Acute adverse effects include anxiety and psychotic symptoms.
The most concerning adverse effects of chronic use include
increased risk of motor vehicle crashes, decreased fertility, altered
adolescent psychosocial development and mental health, as well 
as a hyperemesis syndrome characterized by cyclic episodes of 
nausea and vomiting along with a learned behavior of frequent 
hot bathing to relieve symptoms (28-31). Smoked crude mari- 
juana contains similar carcinogens as tobacco smoke, with up to 
three times the tar content; however, long-term daily use of small 
amounts of marijuana does not appear to affect pulmonary func- 
tion, and marijuana use has been associated with neither lung can- 
cer nor head and neck cancer (32-34). With that said, it should 
be noted that the illegal federal status of marijuana has enabled 
far less rigorous safety evaluation for marijuana than for tobacco. 
There is mounting evidence of marijuana dependence as well as 
a withdrawal syndrome that is similar to tobacco withdrawal and 
includes irritability, sleep disturbance, anorexia, and depressed 
mood (35). A fatal dose of THC is estimated at 15-70g based 
on studies in rodents; however, this is much higher than can be 
reached by heavy marijuana users and no deaths have been solely 
attributed to the use of marijuana (Figure 2) (28,36-41). Given 
the many variables involved, the actual dose of THC delivered to a 
user is not easily quantified and some experts suggest that patients 
should optimize their own dose of marijuana to achieve a desired 
effect rather than follow a prescription (42). 


Efficacy for digestive disorders 

Numerous preclinical studies indicate that endocannabinoids 
are involved in many functions in the digestive system, includ- 
ing gastric acid production, nausea and emesis, food intake, vis- 
ceral sensation, gastrointestinal motility, hepatic fibrogenesis, 
and intestinal inflammation (10). Although modulation of the 
endocannabinoid system has shown therapeutic potential in 
a variety of experimental models of gastrointestinal disease 
(16,43-48), a vast majority of data from controlled human stud- 
ies relate to synthetic cannabinoids. There is very limited clinical 
evidence demonstrating either a beneficial or detrimental effect 
of medical marijuana for digestive disorders. Through a sys- 
tematic search of the existing scientific literature, we identified 
only five randomized controlled trials evaluating the impact of 
marijuana on gastrointestinal function, symptoms, or disease
(Table 1) (49-53). 


Nausea and vomiting. The oral THC analogs (dronabinol and 
nabilone) have compared favorably with dopamine antagonists 
for chemotherapy-induced nausea and vomiting, although there 
are little data comparing them with 5-HT3 antagonists or the NK1
antagonist, aprepitant (20,54-56). Unlike chemotherapy-induced 
vomiting, the associated nausea has been less responsive to 
current first-line therapies, and considerable preclinical evidence 
indicates that manipulation of the endocannabinoid system may 
be beneficial in some cases (57). Although many patients have a 
strong preference for smoked marijuana as a therapy for nausea, 
there are no controlled studies evaluating the antiemetic effects of 
marijuana (54). 


Appetite stimulation and weight gain. Marijuana is known 
as an appetite stimulant. This impression has been borne out 
by studies demonstrating that healthy subjects who smoke 
marijuana have higher food consumption, caloric intake, and 
body weight (49,58). Nevertheless, among cancer patients with 
anorexia and/or cachexia, there are mixed data from controlled 
trials to suggest a modest benefit, at best, from synthetic THC 
(dronabinol) (59-65). For patients with AIDS-related anorexia, 
there is evidence to suggest that dronabinol improves anorexia 
and, at higher doses, leads to weight gain, although results have 
varied (50,52,66-68). Two small, placebo-controlled studies in 
this population demonstrated that smoking marijuana (2-4% 
THC) three times per week increased food intake and body 
weight, with greater increases if marijuana was smoked four 
times daily (50,52). There are currently no data from controlled 
studies to indicate a significant benefit from ingested marijuana 
extract (69) or smoked marijuana for cancer-related anorexia 
and/or cachexia. 


Hepatitis C virus infection. Daily marijuana use among
patients with hepatitis C virus infection has been associated
with increased steatosis and fibrosis (70-72). Conversely,
two relatively small studies evaluated the impact of synthetic
cannabinoid or marijuana use during interferon-based therapy
and suggested that they may increase adherence and sustained
virologic response, presumably through a reduction in treatment-associated
symptoms (73,74).


Table 1. Randomized controlled trials evaluating the impact of marijuana on gastrointestinal function, symptoms, or disease

First author  Year  Function or disorder studied      Treatment
Foltin        1988  Appetite                          Smoked marijuana (2.3% THC); placebo
Haney         2005  HIV                               Smoked marijuana (1.8, 2.9, 3.9% THC); dronabinol; placebo
Strasser      2006  Cancer-related anorexia-cachexia  Oral cannabis extract (2.5 mg THC, 1 mg CBD); oral THC; placebo
Haney         2007  HIV                               Smoked marijuana (2.0, 3.9% THC); dronabinol; placebo
Naftali       2013  Crohn's disease                   Cannabis sativa cigarette (23% THC, 0.5% CBD); placebo

CBD, cannabidiol; THC, tetrahydrocannabinol.
See Appendix for systematic review methodology.


Inflammatory bowel disease (IBD). Patients with both ulcerative 
colitis and Crohn’s disease commonly use marijuana for relief of 
symptoms (75). The majority of IBD patients report that marijuana 
provides significant benefit for poor appetite, nausea, and 
abdominal pain; however, improvement in diarrhea is less clear 
(76). Notably, if it were available legally, over half of nonusers have
reported an interest in using marijuana for IBD symptom relief 
(76). Small retrospective and prospective observational studies 
of patients with Crohn’s disease who smoke medical marijuana 
that contains moderate concentrations of THC indicate that 
medical marijuana has beneficial effects on symptom-based 
disease activity indices, overall well-being, and steroid use 
(77,78). A small randomized controlled trial of marijuana 
smoked twice daily (115 mg THC; 23% THC and <0.5% CBD)
also recently demonstrated a significant response as measured 
by the Crohn's Disease Activity Index (53). Endoscopic disease 
activity was not evaluated, but lack of reduction in C-reactive 
protein indicates that the inflammatory disease burden was not 
affected. Although patients with IBD may feel better while using 
marijuana, evidence for an objective improvement in disease 
activity is lacking. 


Abdominal pain. Preclinical evidence indicates that the endo- 
cannabinoid system plays an important role in the modulation 
of pain. Although there is also significant evidence that synthetic 
cannabinoids and medical marijuana modulate chronic pain, 
especially chronic neuropathic pain, there are no controlled 


© 2015 by the American College of Gastroenterology 


Table 1 (continued)

Subjects                                            Outcome                                                                                              Reference
6 healthy adult males                               40% increase in daily caloric intake because of more frequent snacking with marijuana                (49)
30 HIV+ patients                                    Comparable increases in caloric intake for marijuana and dronabinol over placebo                     (50)
164 cancer patients                                 No difference in appetite or quality-of-life outcomes                                                (51)
10 HIV+ patients                                    Dose-dependent increase in caloric intake and body weight for marijuana and dronabinol over placebo  (52)
21 patients with moderately active Crohn's disease  Significant clinical response but no decrease in inflammatory markers with marijuana                 (53)


human studies specifically evaluating the efficacy of medical 
marijuana for chronic abdominal pain (20). 


Marijuana, physicians, and the law 

State medical marijuana laws vary widely with respect to the 
medical conditions for which marijuana is approved, the admin- 
istrative steps necessary for patients to legally obtain marijuana, 
and the role physicians must play in the process (Supplementary 
Table S1 online). Cancer and HIV/AIDS are considered qualify- 
ing medical conditions in all states with defined medical mari- 
juana programs. Many states also permit the use of marijuana for 
other conditions including nausea, cachexia, hepatitis C infection, 
and Crohn's disease. 

Most states that allow for medical marijuana have designed sys- 
tems that seem to protect physicians. Given the federal prohibi- 
tion against writing a prescription for medical marijuana, state 
laws commonly direct physicians to provide patients with documents
termed “certifications,” “recommendations,” or “referrals”
for marijuana use. Generally, before providing certification of 
medical necessity for marijuana to a patient, a physician must have 
an active state medical license and have an ongoing therapeutic 
relationship with that patient, the definition of which varies by 
state. Some states also require specific certification to recommend 
medical marijuana (see Supplementary Table S1). Physicians are 
not allowed to have contact with marijuana dispensaries, to refer 
patients to a specific dispensary, to have a financial relationship 
with a dispensary, or to certify family members or themselves for 
marijuana use. 

Still, in addition to concerns about safety and efficacy, the 
federal legal status of marijuana as a Schedule I controlled sub- 
stance leads many physicians to be wary of recommending 
medical marijuana to their patients. Currently, the possession,
manufacture, and distribution of marijuana, even for purposes
of medical treatment, remain a violation of the federal Controlled
Substances Act. It is not entirely clear whether a physician
would be liable under federal law for the act of recommending 
or prescribing marijuana for a patient's medical condition. In 
response to state initiatives that attempted to shield physicians 
who recommended the medicinal use of marijuana, the federal 
government declared a policy that “recommending or prescrib- 
ing Schedule I controlled substances” would lead to revocation 
of a physician's registration to prescribe controlled substances, 
exclusion from the Medicare/Medicaid programs, and criminal 
prosecution (79). A group of physicians and patients subsequently 
obtained a permanent injunction against enforcement of this 
policy (80). In the case of Conant v. Walters, the Ninth Circuit
Court of Appeals then upheld the injunction on the grounds that 
the federal government’s drug enforcement policy violates the 
First Amendment values of the physician-patient relationship 
(81). When the federal government appealed the Conant
decision to the US Supreme Court, the Court declined to hear the case at
that time; it may still hear future appeals.

For physicians in states within the Ninth Circuit's jurisdiction 
(see Figure 1), this means that the federal government continues 
to be forbidden from prosecuting physicians solely on the basis 
of their communication with patients regarding marijuana. How- 
ever, for physicians in other jurisdictions, Conant is not a binding 
precedent and the threat of federal prosecution for recommend- 
ing medical marijuana remains. Under the Obama administra- 
tion, the Department of Justice has indicated several times that 
federal resources will generally not be used to prosecute indi- 
viduals or caregivers who are in compliance with state marijuana 
laws (82,83). Nevertheless, the Department of Justice retains the 
authority to enforce federal marijuana laws. Although seemingly 
unlikely, it remains to be seen whether a new federal administra- 
tion will choose to enforce federal law and prosecute physicians 
who recommend medical marijuana, notwithstanding a state's law 
to the contrary. 


Concluding remarks 

It is increasingly clear that the endocannabinoid system plays a 
role in diverse biological pathways that affect gastrointestinal and 
hepatic physiology and pathology. However, due in large part to 
a dearth of high-quality human study data, the clinical efficacy 
of marijuana or its constituent phytocannabinoids for diges- 
tive disorders remains unclear. Although there are no studies 
proving the long-term safety of marijuana use, its safety profile 
compares favorably to other illicit substances, to legal intoxicants 
like alcohol, possibly to opioids, and to some existing therapies 
for digestive disorders. Claims of marijuana as a cure-all are 
clearly unfounded but, at the least, it appears to hold promise as a 
modifier of gastrointestinal symptoms. As medical marijuana use 
continues to grow in the United States, physicians must take the 
lead in understanding the risks and benefits in order to provide 
accurate information to patients. This understanding necessitates 
further well-designed scientific inquiry that will require federal
reclassification of marijuana from its current status as a Schedule 
I controlled substance. In the meantime, physicians are in a chal- 
lenging position between providing patients with their opinion 
and minimizing punitive consequences. 

Study Highlights


WHAT IS CURRENT KNOWLEDGE 


Despite federal prohibition, the use of marijuana for rec- 
reational and medicinal purposes continues to increase. 


Some patients with gastrointestinal disorders self-medicate 
with marijuana. 


Increasing marijuana acceptance is generating more ques- 
tions regarding its use for health conditions. 


WHAT IS NEW HERE 


A focused summary of the data to support the use of mari- 
juana for digestive disorders. 


An update on the legal considerations for healthcare 
providers who are considering recommending medical 
marijuana for patients. 

MEDLINE/PubMed database to identify randomized controlled clinical
trials involving marijuana and gastrointestinal conditions.
This was undertaken utilizing the search phrase: “cannabis OR 
marijuana OR tetrahydrocannabinol OR cannabidiol OR THC 
OR cannabinoid OR dronabinol OR nabilone OR nabiximols 
AND _____”. Results were filtered for human studies and clinical
trials and were hand-reviewed for relevance. The final search
term was variable and included the following: esophagus, stom- 
ach, pancreas, gallbladder, biliary, liver, small intestine, colon, 
large intestine, rectum, anus, cancer, esophageal cancer (ade- 
nocarcinoma), stomach cancer (adenocarcinoma), carcinoid, 
liver cancer, hepatocellular carcinoma, hepatoma, pancreatic 